Preface 



These proceedings contain a selection of refereed papers presented at or related 
to the 3rd Annual Workshop of the Types Working Group (Computer-Assisted 
Reasoning Based on Type Theory, EU IST project 29001), which was held from 
April 30 to May 4, 2003, in Villa Gualino, Turin, Italy. The workshop was 
attended by about 100 researchers. Out of 37 submitted papers, 25 were selected 
after a refereeing process. The final choices were made by the editors. 

Two previous workshops of the Types Working Group under EU IST project 
29001 were held in 2000 in Durham, UK, and in 2002 in Berg en Dal (close 
to Nijmegen), The Netherlands. These workshops followed a series of meetings 
organized in the period 1993-2002 within previous Types projects (ESPRIT 
BRA 6453 and ESPRIT Working Group 21900). The proceedings of these earlier 
workshops were also published in the LNCS series, as volumes 806, 996, 
1158, 1512, 1657, 2277, and 2646. ESPRIT BRA 6453 was a continuation of 
ESPRIT Action 3245, Logical Frameworks: Design, Implementation and 
Experiments. Proceedings for annual meetings under that action were published by 
Cambridge University Press in the books “Logical Frameworks” and “Logical 
Environments”, edited by G. Huet and G. Plotkin. 

We are very grateful to the members of the research group “Semantics and 
Logics of Computation” of the Computer Science Department of the University 
of Turin, who helped organize the Types 2003 meeting in Torino. We especially 
want to thank Daniela Costa and Claudia Goggioli for the secretarial support, 
Sergio Rabellino for the technical support, and Ugo de’ Liguoro for helping out 
in various ways. 

We also acknowledge the support from the Types Project, EU IST 29001, 
which makes the Types workshops possible. 
A Modular Hierarchy of Logical Frameworks 



Robin Adams 

University of Manchester 
robin.adams@ma.man.ac.uk 



Abstract. We present a method for defining logical frameworks as a 
collection of features which are defined and behave independently of one 
another. Each feature is a set of grammar clauses and rules of deduction 
such that the result of adding the feature to a framework is a conservative 
extension of the framework itself. We show how several existing logical 
frameworks can be so built, and how several much weaker frameworks 
defined in this manner are adequate for expressing a wide variety of 
object logics. 



1 Introduction 

Logical frameworks were invented because there were a large number of differing 
systems of logic, with no common language or environment for their investigation 
and implementation. However, we now find ourselves in the same situation with 
the frameworks themselves. There are many systems that are used as logical 
frameworks, and it is often difficult to compare them or share results between 
them. It is often much work to discover whether two frameworks can express the 
same class of object logics, or whether one is stronger or weaker than the other. If 
we are interested in metavariables, and we compare Pientka and Pfenning’s work 
[1] with Jojgov’s [2], it is difficult to see which differences are due to the different 
handling of metavariables, and which are due to differences in the underlying 
logical framework. 

To redress this situation somewhat, I humbly present the first steps towards 
a common scheme within which a surprising number of different frameworks 
can be fitted. We take a modular approach to the design of logical frameworks, 
defining a framework by specifying a set of features, each of which is defined and 
behaves independently of the others. Together, all the frameworks that can be 
built from a given set of features form a modular hierarchy of logical frameworks. 

We may give an informal definition of a feature thus: 

A feature F is a set of grammar clauses and rules of deduction such that, 
for any logical framework L, the result of adding F to L is a conservative 
extension of L. 

(This cannot be made a formal definition, as we do not (yet) have a notion of 
“any logical framework”.) 

It is not surprising that features exist: one would expect, for example, that 
adding a definitional mechanism to a typing system should yield a conservative 
extension. Perhaps more surprising is the fact that such things as 
lambda-abstraction can be regarded as features. In fact, we shall show how a logical 
framework can be regarded as being nothing but a set of features. More 
precisely, we shall define a system that we call the basic framework BF, and a 
number of features that can be added to it, and we shall show how a number of 
existing frameworks can be built by selecting the appropriate features. 

We shall also show that most of these features are unnecessary from the 
theoretical point of view — that is, a much smaller set of features suffices to 
express a wide variety of object logics. These ‘unnecessary’ features may well be 
desirable for implementation, of course. 

It may be asked why we insist that our features always yield conservative 
extensions. This would seem to be severely limiting; in one’s experience with 
typing systems, rarely are extensions conservative. For typing systems in general, 
this is true. But I would argue that logical frameworks are an exception. The 
fact that all the features presented here yield conservative extensions is evidence 
to this effect. And it would seem to be desirable when working with a logical 
framework: if we add a feature to widen the class of object logics expressible, 
for example, we still want the old object logics to behave as they did before. 

We suggest that, if this work were taken further, it would be possible and 
desirable to define mechanisms such as metavariables or subtyping as features, and 
investigate their properties separately from one another and from any specific 
framework. If we did this for metavariables, for example, we would then know 
immediately what the properties of ELF with metavariables were, or Martin-Löf's 
Theory of Types with metavariables, or ... 



2 Logical Frameworks 

Let us begin by being more precise as to what we mean by a logical framework. 

Broadly speaking, logical frameworks can be used in two distinct ways. The 
first is to define an object logic by means of a signature, a series of declarations 
of constants, equations, etc. The typable terms under that signature should then 
correspond to the terms, derivations, etc. of the object logic, using contexts to 
keep track of free variables and undischarged hypotheses. Examples include the 
Edinburgh Logical Framework [3] and Martin-Löf's Theory of Types [4]. We 
shall call a framework used in this way a logic-modelling framework. 

The second is to use the logical framework as a book-writing system, as 
exemplified by the AUTOMATH family of systems [5]. The most important 
judgement form in such a framework is that which declares a book correct; the 
other judgement forms are only needed as auxiliaries for deriving this first form 
of judgement. 

These two kinds of system behave in very similar ways. Any system of one 
kind can be used as a system of the other, by simply reading ‘signature’ for 
‘book’, or vice versa. This is a striking fact, considering the difference in use. In 
a system of the first kind, deriving that a signature is valid is just the first step 
in using an object logic; in a book-writing system, it is the only judgement form 
of importance. We shall take advantage of this similarity. Our features shall be 
written with logic-modelling frameworks in mind; it shall turn out that they are 
equally useful for building book-writing frameworks. 

We consider a logical framework to consist of: 

1. Disjoint, countably infinite sets of variables and constants. 

2. A number of syntactic classes of expressions, defined in a BNF-style grammar 
by a set of constructors, each of which forms a member of one class from 
members of other classes, possibly binding variables. 

3. Three syntactic classes that are distinguished as being the classes of 
signature declarations, context declarations and judgement bodies. Each signature 
declaration is specified to be either a declaration of a particular constant, 
or of none. Similarly, each context declaration is specified to be either a 
declaration of a particular variable or of none. 

We now define a signature to be a finite sequence of signature declarations, 
such that no two declarations are of the same constant. The domain of the 
signature $\Sigma$, $\operatorname{dom}\Sigma$, is then defined to be the sequence consisting of the 
constants declared in $\Sigma$, in order. Similarly, we define a context to be a 
finite sequence of context declarations, no two of the same variable, and we 
define its domain similarly. 

Finally, we define a judgement to be a string of one of two forms: either 

\[ \Sigma\ \mathrm{sig} \]
or 

\[ \Gamma \vdash_\Sigma J \]

where $\Sigma$ is a signature, $\Gamma$ a context, and $J$ a judgement body. 

4. A set of defined operations and relations on terms. Typically, these shall 
include one or more relations of reducibility and convertibility. 

5. The final component of a logical framework is a set of rules of deduction 
which define the set of derivable judgements. 



2.1 The Basic Framework BF 



As is to be expected, BF is a very simple system. It allows: the declaration 
of variable and constant types; the declaration of variables and constants of a 
previously declared type; and the assertion that a variable or constant has the 
type with which it was declared, or is itself a type. 

The grammar of BF is as follows: 



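\[
\begin{array}{lrcll}
\text{Term} & a & ::= & x \mid c & \\
\text{Kind} & A & ::= & \mathrm{Type} \mid \mathrm{El}(a) & \\
\text{Signature Declaration} & \delta & ::= & c : A & \text{of } c \\
\text{Context Declaration} & \gamma & ::= & x : A & \text{of } x \\
\text{Judgement Body} & J & ::= & \mathrm{valid} \mid A\ \mathrm{kind} \mid a : A &
\end{array}
\]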



The rules of deduction of BF are given in Figure 1. 
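\[
\begin{gathered}
\dfrac{}{\langle\rangle\ \mathrm{sig}}
\qquad
\dfrac{\Sigma\ \mathrm{sig}}{\vdash_\Sigma \mathrm{valid}}
\\[1ex]
\dfrac{\vdash_\Sigma A\ \mathrm{kind}}{\Sigma, c : A\ \mathrm{sig}}\;(c \notin \operatorname{dom}\Sigma)
\qquad
\dfrac{\Gamma \vdash_\Sigma A\ \mathrm{kind}}{\Gamma, x : A \vdash_\Sigma \mathrm{valid}}\;(x \notin \operatorname{dom}\Gamma)
\\[1ex]
\dfrac{\Gamma \vdash_\Sigma \mathrm{valid}}{\Gamma \vdash_\Sigma c : A}\;(c : A \in \Sigma)
\qquad
\dfrac{\Gamma \vdash_\Sigma \mathrm{valid}}{\Gamma \vdash_\Sigma x : A}\;(x : A \in \Gamma)
\\[1ex]
\dfrac{\Gamma \vdash_\Sigma \mathrm{valid}}{\Gamma \vdash_\Sigma \mathrm{Type}\ \mathrm{kind}}
\qquad
\dfrac{\Gamma \vdash_\Sigma a : \mathrm{Type}}{\Gamma \vdash_\Sigma \mathrm{El}(a)\ \mathrm{kind}}
\end{gathered}
\]

Fig. 1. The basic framework BF 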
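
The rules of BF are simple enough to be read as a checking procedure. The following is a minimal sketch of such a reading, not part of the original development: the representation of declarations as name/kind pairs and the function names `sig_ok`, `ctx_ok`, `is_kind` and `has_kind` are assumptions made for illustration, and variables and constants are assumed to be drawn from disjoint sets of names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

# Kinds of BF: Type, or El(a) where a is a variable or constant name.
@dataclass(frozen=True)
class Type:
    pass

@dataclass(frozen=True)
class El:
    term: str

Kind = Union[Type, El]
Decl = Tuple[str, Kind]  # a declaration "name : kind" (illustrative encoding)

def lookup(decls: List[Decl], name: str) -> Optional[Kind]:
    """Return the kind with which `name` was declared, if any."""
    for declared_name, kind in decls:
        if declared_name == name:
            return kind
    return None

def has_kind(sig: List[Decl], ctx: List[Decl], name: str, kind: Kind) -> bool:
    """Gamma |-_Sigma a : A, assuming Gamma is already known to be valid."""
    declared = lookup(ctx, name)
    if declared is None:
        declared = lookup(sig, name)
    return declared is not None and declared == kind

def is_kind(sig: List[Decl], ctx: List[Decl], kind: Kind) -> bool:
    """Gamma |-_Sigma A kind: Type is a kind; El(a) is a kind when a : Type."""
    if isinstance(kind, Type):
        return True
    return has_kind(sig, ctx, kind.term, Type())

def sig_ok(sig: List[Decl]) -> bool:
    """Sigma sig: each constant is fresh and its kind is well formed."""
    for i, (name, kind) in enumerate(sig):
        earlier = sig[:i]
        if lookup(earlier, name) is not None or not is_kind(earlier, [], kind):
            return False
    return True

def ctx_ok(sig: List[Decl], ctx: List[Decl]) -> bool:
    """Gamma |-_Sigma valid: Sigma is a signature and each variable is fresh."""
    if not sig_ok(sig):
        return False
    for i, (name, kind) in enumerate(ctx):
        earlier = ctx[:i]
        if lookup(earlier, name) is not None or not is_kind(sig, earlier, kind):
            return False
    return True
```

For instance, the signature declaring `nat : Type` and then `zero : El(nat)` is accepted, whereas declaring `zero : El(nat)` before `nat` is rejected, because `El(nat)` is not yet a kind at that point.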

3 Features and the Modular Hierarchy 

A feature that depends on the logical framework L consists of any number of 
new entities: new syntactic classes, new constructors, new defined operations and 
relations and new rules of deduction. The new constructors may take arguments 
from new classes or those of L, bind new variables or those of L, and return 
expressions in new classes or those of L. In particular, they may create new 
signature declarations, context declarations and judgement bodies. Likewise, the 
new defined operations and relations should be defined on both old and new 
expressions, and the new rules of deduction may use both old and new judgement 
forms. 

A feature may also introduce redundancies. A redundancy takes an old 
constructor and declares that it is to be replaced by a certain expression. That is, 
the constructor is no longer part of the grammar; wherever it appeared in a 
defined operation or relation or a rule of deduction, its place is to be taken by 
the given expression. 

Now, if L' is any logical framework that extends L, we define the logical 
framework L' + F in the obvious manner. 

It should be noted that these rules of deduction are assumed to automatically 
extend themselves when future features are added. For example, if a feature 
contains the rule of deduction 

\[ \dfrac{\Gamma \vdash_\Sigma M : A}{\Gamma \vdash_\Sigma M = M : A} \]

and we later introduce a new constructor for terms M, this rule is assumed to 
hold for the new terms M as well as the old. 

(Formally defining features in such a way that this is possible requires 
explicitly defining classes of meta-expressions in the manner of [6]. We shall not 
go into such details here.) 

Finally, we define: 
Definition 1. A feature F that depends on the set of features $\{F_1, F_2, \ldots\}$ is a 
feature that depends on the logical framework 

\[ \mathrm{BF} + F_1 + F_2 + \cdots . \]

Thus, if F depends on $\{F_1, F_2, \ldots\}$, we can add F to any framework in the 
hierarchy that contains all of $F_1, F_2, \ldots$. Note that we do not stipulate in this 
definition whether the set $\{F_1, F_2, \ldots\}$ is finite or infinite. 



3.1 Parametrization 

The first, and most important, of our features are those which allow the 
declaration of variables and constants with parameters. This mechanism is taken as 
fundamental by the systems of the AUTOMATH [5] family as well as PAL$^+$ [9], 
but can be seen as a subsystem of almost all logical frameworks. 
Parametrization provides a common core, above which the different forms of abstraction 
($\lambda$-abstraction with typed or untyped domains, and with $\beta$- or $\beta\eta$-conversion, 
as well as PAL$^+$-style abstraction by let-definition) can be built as conservative 
extensions. 

We define a series of features: SPar(1), SPar(2), SPar(3), ..., and also 
LPar(1), LPar(2), LPar(3), .... These extend one another in the manner 
shown in Figure 2. 



[Figure: diagram with nodes BF, LPar(1), LPar(2), ..., LPar($\omega$) and SPar(1), SPar(2), ..., SPar($\omega$), joined by inclusion arrows.] 

Fig. 2. The initial fragment of the modular hierarchy 



BF already allows the declaration of constants in kinds: $c_1 : A$. 

LPar(1) allows the declaration of constants in first-order kinds: 
$c_2 : (x_1 : A_1, \ldots, x_n : A_n)A$. This declaration indicates that $c_2$ is a constant that takes 
parameters $x_1$ of kind $A_1$, ..., $x_n$ of kind $A_n$, and returns a term $c_2[x_1, \ldots, x_n]$ 
of kind $A$. 

LPar(2) allows the parameters themselves to have parameters: 
$c_3 : (x_1 : (x_{11} : A_{11}, \ldots, x_{1k_1} : A_{1k_1})A_1, \ldots, x_n : (x_{n1} : A_{n1}, \ldots, x_{nk_n} : A_{nk_n})A_n)A$. 

LPar(3) allows these second-order parameters to have parameters, and so on. 
Similarly for declaration of variables. 

We also define the feature LPar($\omega$) to be the union of all these features, 
allowing any level of parametrization. 

The sequence of features SPar(n) is similar; the only difference is that, in 
SPar(n), every parameter must be in a small kind; that is, each $A_i$, $A_{ij}$, ... 
above must be of the form El(a); it cannot be Type. (In SPar(n), $A$ itself, 
the rightmost kind, can be Type in the declaration of a constant, but not in a 
declaration of a variable.) 

The full details of these features are as follows: 



Parameters in Small Kinds, SPar(n) 

Grammar. Before we can introduce the new grammar constructors, we need to 
make a few definitions. 

We define an m-th order pure context by recursion on m as follows. An m-th 
order pure context is a string of the form 

\[ (x_1 : (\Delta_1)\,\mathrm{El}(a_1), \ldots, x_k : (\Delta_k)\,\mathrm{El}(a_k)) \]

where each $x_i$ is a variable, all distinct, $\Delta_i$ a pure context of order $< m$, and $a_i$ 
a term. Its domain is $(x_1, \ldots, x_k)$. 

We define an abstraction to be a string of the form 

\[ [\vec{x}]M \]

where $\vec{x}$ is a sequence of distinct variables, and $M$ a term. We take each member 
of $\vec{x}$ to be bound within $M$ in this abstraction, and we define free and bound 
variables and identify all our expressions up to $\alpha$-conversion in the usual manner. 
We write $\overline{M}$, $\overline{N}$, ... for arbitrary abstractions, and $\vec{M}$, $\vec{N}$, ... for sequences 
of abstractions. It is important to note that these 
are not first-class objects of every framework that contains SPar(n). 

Now, we add the following clause to the grammar: 

\[ z[\vec{M}] \]

is a term, where $z$ is a variable or constant, and $\vec{M}$ a sequence of abstractions. 
This clause subsumes the grammar of BF, for $x[]$ and $c[]$ are terms when $x$ is 
a 0-ary variable and $c$ a 0-ary constant. 

We also allow declarations of the form 

\[ c : (\Delta)A \]

in the signature, where $c$ is a constant, $\Delta$ a pure context of order $\le n$, and $A$ a 
kind; and those of the form 

\[ x : (\Delta)\,\mathrm{El}(a) \]

in the context, where $x$ is a variable, $\Delta$ a pure context of order $\le n$, and $a$ a 
term. Again, these subsume those of BF. 
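
The recursion on order in the definition of pure contexts can be made concrete. The sketch below is illustrative only: the `Param` record and the function `order` are hypothetical names, small kinds are kept as opaque text, and the reading that the empty context has order 0 while a non-empty context has order one more than the largest order among its parameters' contexts is an assumption drawn from the definition above.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Param:
    """One entry x : (Delta) El(a) of a pure context (illustrative encoding)."""
    name: str                     # the variable x
    context: Tuple["Param", ...]  # the pure context Delta of x's own parameters
    small_kind: str               # the term a in El(a), kept as opaque text

def order(delta: Tuple[Param, ...]) -> int:
    """Least m such that delta is an m-th order pure context."""
    if not delta:
        return 0
    return 1 + max(order(p.context) for p in delta)

# x : El(term) is a 0-ary parameter; p : (x : El(term)) El(prop) takes one.
x = Param("x", (), "term")
p = Param("p", (x,), "prop")
```

Under this reading, a constant whose parameter list is `(p,)` has a second-order pure context, so its declaration needs at least SPar(2), while a parameter list `(x,)` is already admitted by SPar(1).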
Defined Operations. We define the operation of instantiation as follows. This 
operation takes the place of substitution; we cannot substitute for a variable of 
kind $(\Delta)A$, as we have no first-class objects in such a kind; indeed, we have 
no such kind yet. But it is possible to define a term 

\[ \{\overline{M}_1/x_1, \ldots, \overline{M}_m/x_m\}M \]

where $M$ is a term, and, for $i = 1, \ldots, m$, $\overline{M}_i = [\vec{y}_i]M_i$ is an abstraction, and 
$x_i$ a variable, in such a way that first-class abstractions are never needed. They 
can, later, be added as a conservative extension. It may aid the understanding 
of the definition of instantiation to note that, once abstractions are added, 

$\{\overline{M}_1/x_1, \ldots, \overline{M}_m/x_m\}M$ is the normal form of $[\overline{M}_1/x_1, \ldots, \overline{M}_m/x_m]M$. 

The definition is as follows: 

\[
\begin{aligned}
\{\vec{M}/\vec{x}\}\, z[\vec{N}] &= z[\{\vec{M}/\vec{x}\}\vec{N}]
  && \text{(if $z$ is a constant or a variable not in $\vec{x}$)} \\
\{\vec{M}/\vec{x}\}\, x_i[\vec{N}] &= \{\{\vec{M}/\vec{x}\}\vec{N}/\vec{y}_i\}M_i
\end{aligned}
\]
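
The two clauses of instantiation can be sketched as a recursive function. This is only an illustration under stated assumptions: the classes `Term` and `Abs` and the function names are hypothetical, and bound variables are assumed to be chosen distinct from all free variables, so that no renaming is needed to avoid capture.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class Term:
    """A term z[M1, ..., Mk]: a head applied to a sequence of abstractions."""
    head: str
    args: Tuple["Abs", ...] = ()

@dataclass(frozen=True)
class Abs:
    """An abstraction [y1, ..., yj]M binding the variables y in the body M."""
    binders: Tuple[str, ...]
    body: Term

def instantiate(sub: Dict[str, Abs], term: Term) -> Term:
    """Compute {M/x}term, where sub maps each x_i to its abstraction."""
    # Instantiate inside the arguments first (needed by both clauses).
    new_args = tuple(instantiate_abs(sub, arg) for arg in term.args)
    if term.head not in sub:
        # First clause: the head is a constant or an unaffected variable.
        return Term(term.head, new_args)
    # Second clause: the head is some x_i, replaced by [y_i]M_i; its bound
    # variables y_i are in turn instantiated by the new arguments.
    target = sub[term.head]
    if len(target.binders) != len(new_args):
        raise ValueError("arity mismatch for " + term.head)
    return instantiate(dict(zip(target.binders, new_args)), target.body)

def instantiate_abs(sub: Dict[str, Abs], abstraction: Abs) -> Abs:
    """Instantiate under binders; bound variables shadow entries of sub."""
    inner = {x: m for x, m in sub.items() if x not in abstraction.binders}
    return Abs(abstraction.binders, instantiate(inner, abstraction.body))
```

For example, instantiating `x` by the abstraction `[y]f[y]` in the term `x[a]` yields `f[a]` directly: the result is already in normal form, which is why first-class abstractions are never needed.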

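We also need a defined judgement form. If $\vec{M}$ is an abstraction sequence, 
and $\Delta$ a pure context, we define a set of judgements 

\[ \Gamma \Vdash_\Sigma \vec{M} :: \Delta \]

(read: under signature $\Sigma$ and context $\Gamma$, $\vec{M}$ satisfies $\Delta$). The definition is as 
follows. Let 

\[ \Delta = x_1 : (\Delta_1)\,\mathrm{El}(a_1), \ldots, x_m : (\Delta_m)\,\mathrm{El}(a_m) \]

and let 

\[ \overline{M}_i = [\vec{y}_i]M_i . \]

We take $\Gamma \Vdash_\Sigma \vec{M} :: \Delta$ to be defined only when $\vec{y}_i = \operatorname{dom}\Delta_i$ for all $i$. 

\[ \Gamma \Vdash_\Sigma \vec{M} :: \Delta \]

is the following set of judgements: 

\[
\begin{gathered}
\Gamma, \Delta_1 \vdash_\Sigma M_1 : \mathrm{El}(a_1) \\
\Gamma, \{\overline{M}_1/x_1\}\Delta_2 \vdash_\Sigma M_2 : \mathrm{El}(\{\overline{M}_1/x_1\}a_2) \\
\Gamma, \{\overline{M}_1/x_1, \overline{M}_2/x_2\}\Delta_3 \vdash_\Sigma M_3 : \mathrm{El}(\{\overline{M}_1/x_1, \overline{M}_2/x_2\}a_3) \\
\vdots \\
\Gamma, \{\overline{M}_1/x_1, \ldots, \overline{M}_{m-1}/x_{m-1}\}\Delta_m \vdash_\Sigma M_m : \mathrm{El}(\{\overline{M}_1/x_1, \ldots, \overline{M}_{m-1}/x_{m-1}\}a_m)
\end{gathered}
\]

In the case $m = 0$, we take $\Gamma \Vdash_\Sigma \vec{M} :: \Delta$ (i.e. $\Gamma \Vdash_\Sigma () :: ()$) to be the single 
judgement 

\[ \Gamma \vdash_\Sigma \mathrm{valid} . \]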
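
As a small worked instance of this definition (an illustration, not taken from the original text), consider a pure context with two 0-ary parameters, so that both inner contexts are empty and each abstraction $[\,]M_i$ is just the term $M_i$:

```latex
\[
\Delta = x_1 : \mathrm{El}(a_1),\; x_2 : \mathrm{El}(a_2)
\]
% Gamma ||- (M_1, M_2) :: Delta then unfolds to exactly two judgements:
\[
\begin{gathered}
\Gamma \vdash_\Sigma M_1 : \mathrm{El}(a_1) \\
\Gamma \vdash_\Sigma M_2 : \mathrm{El}(\{M_1/x_1\}a_2)
\end{gathered}
\]
```

The second judgement shows the dependency between parameters: the kind expected of $M_2$ is obtained by instantiating $x_1$ with the first argument.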
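Rules of Deduction. The rules of deduction in SPar(n) are as follows: 

\[
\begin{gathered}
\dfrac{\Delta \vdash_\Sigma \mathrm{valid}}{\Sigma, c : (\Delta)\,\mathrm{Type}\ \ \mathrm{sig}}\;(c \notin \operatorname{dom}\Sigma)
\qquad
\dfrac{\Delta \vdash_\Sigma a : \mathrm{Type}}{\Sigma, c : (\Delta)\,\mathrm{El}(a)\ \ \mathrm{sig}}\;(c \notin \operatorname{dom}\Sigma)
\\[1ex]
\dfrac{\Gamma, \Delta \vdash_\Sigma a : \mathrm{Type}}{\Gamma, x : (\Delta)\,\mathrm{El}(a) \vdash_\Sigma \mathrm{valid}}\;(x \notin \operatorname{dom}\Gamma)
\\[1ex]
\dfrac{\Gamma \Vdash_\Sigma \vec{M} :: \Delta}{\Gamma \vdash_\Sigma c[\vec{M}] : \{\vec{M}/\operatorname{dom}\Delta\}A}\;(c : (\Delta)A \in \Sigma)
\\[1ex]
\dfrac{\Gamma \Vdash_\Sigma \vec{M} :: \Delta}{\Gamma \vdash_\Sigma x[\vec{M}] : \mathrm{El}(\{\vec{M}/\operatorname{dom}\Delta\}a)}\;(x : (\Delta)\,\mathrm{El}(a) \in \Gamma)
\end{gathered}
\]

Finally, SPar($\omega$) is defined to be the union of all the features SPar(n). 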



Parameters in Large Kinds, LPar(n) 

The features LPar(n) and LPar($\omega$) 
are defined in exactly the same manner as SPar(n) and SPar($\omega$), with only two 
differences. The first is the definition of pure context, which now allows Type 
to appear: 

An m-th order pure context is a string of the form 

\[ (x_1 : (\Delta_1)A_1, \ldots, x_k : (\Delta_k)A_k) \]

where each $x_i$ is a variable, all distinct, $\Delta_i$ is a pure context of order $< m$, and 
$A_i$ is either Type or $\mathrm{El}(a_i)$ for some term $a_i$. 

The second is that large kinds are permitted in context declarations as well 
as signature declarations; that is, we allow declarations of the form $x : (\Delta)A$ in 
the context, where $x$ is a variable, $\Delta$ a pure context of appropriate order, and $A$ 
either Type or El(a) for some term $a$. 

3.2 Lambda Abstraction 

We can now, if we wish, build in traditional $\lambda$-abstraction. It should be noted 
that this does not change the class of object theories that can be expressed by 
the framework. 

We can make these abstractions typed or untyped (i.e. explicitly include the 
domain or not), and we can choose to use $\beta$- or $\beta\eta$-conversion. These two choices 
lead to four features that can be added to a framework. We shall denote them 
$\lambda^t_\beta$, $\lambda^t_{\beta\eta}$, $\lambda^u_\beta$, $\lambda^u_{\beta\eta}$. We shall give here the details of $\lambda^t_\beta$ (typed domains, 
$\beta$-conversion); the others are very similar. 

We shall describe here a feature $\lambda^t_\beta$ to be built on top of BF + LPar($\omega$). 
It would be easy to change the details to give a feature that could be added to 
BF + LPar(n), BF + SPar(n), or BF + SPar($\omega$). 
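We add the following clauses to the grammar: 

\[
\begin{array}{lrcl}
\text{Term} & M & ::= & \cdots \mid [x : A]M \mid M[M] \\
\text{Kind} & A & ::= & \cdots \mid (x : A)A
\end{array}
\]

There are two redundancies in this feature. The first: let $c$ be a constant, 
and let 

\[ c : (x_1 : (\Delta_1)A_1, \ldots, x_m : (\Delta_m)A_m)A \]

be in the signature, where 

\[ \Delta_i = (x_{i1} : (\Delta_{i1})A_{i1}, \ldots, x_{ik_i} : (\Delta_{ik_i})A_{ik_i}) . \]

Then we identify the term 

\[ c[[\vec{x}_1]M_1, \ldots, [\vec{x}_m]M_m] \]

with the base term 

\[
c[[x_{11} : (\Delta_{11})A_{11}] \cdots [x_{1k_1} : (\Delta_{1k_1})A_{1k_1}]M_1] \cdots
[[x_{m1} : (\Delta_{m1})A_{m1}] \cdots [x_{mk_m} : (\Delta_{mk_m})A_{mk_m}]M_m] .
\]

The second is a similar redundancy for terms beginning with a variable. 

We define the relations of $\beta$-reduction, $\beta$-conversion, etc. on our classes of 
terms in the usual manner, based on the contraction 

\[ ([x : A]M)[N] \to_\beta [N/x]M . \]

The rules of deduction of this feature are now: 

\[
\begin{gathered}
\dfrac{\Gamma, x : A \vdash_\Sigma B\ \mathrm{kind}}{\Gamma \vdash_\Sigma (x : A)B\ \mathrm{kind}}
\\[1ex]
\dfrac{\vdash_\Sigma A\ \mathrm{kind}}{\Sigma, c : A\ \mathrm{sig}}\;(c \notin \operatorname{dom}\Sigma)
\qquad
\dfrac{\Gamma \vdash_\Sigma A\ \mathrm{kind}}{\Gamma, x : A \vdash_\Sigma \mathrm{valid}}\;(x \notin \operatorname{dom}\Gamma)
\\[1ex]
\dfrac{\Gamma, x : A \vdash_\Sigma M : B}{\Gamma \vdash_\Sigma [x : A]M : (x : A)B}
\qquad
\dfrac{\Gamma \vdash_\Sigma M : (x : A)B \quad \Gamma \vdash_\Sigma N : A}{\Gamma \vdash_\Sigma M[N] : [N/x]B}
\\[1ex]
\dfrac{\Gamma \vdash_\Sigma M : A \quad \Gamma \vdash_\Sigma B\ \mathrm{kind}}{\Gamma \vdash_\Sigma M : B}\;(A =_\beta B)
\end{gathered}
\]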
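
To see these rules at work, here is a short derivation of the identity abstraction, given as an illustration rather than as part of the original text. It assumes $\Gamma \vdash_\Sigma A\ \mathrm{kind}$ and $x \notin \operatorname{dom}\Gamma$, and uses the variable rule of the underlying framework for the middle step:

```latex
\[
\dfrac{\dfrac{\dfrac{\Gamma \vdash_\Sigma A\ \mathrm{kind}}
                    {\Gamma, x : A \vdash_\Sigma \mathrm{valid}}\;(x \notin \operatorname{dom}\Gamma)}
             {\Gamma, x : A \vdash_\Sigma x : A}}
      {\Gamma \vdash_\Sigma [x : A]x : (x : A)A}
\]
% Applying the result to any N with Gamma |- N : A, by the application rule:
\[
\Gamma \vdash_\Sigma ([x : A]x)[N] : [N/x]A
\qquad\text{and}\qquad
([x : A]x)[N] \to_\beta [N/x]x = N .
\]
```

Since $x$ is not declared in $\Gamma$, it cannot occur free in $A$ (assuming, as usual, that the free variables of a derivable kind are declared in its context), so $[N/x]A$ is just $A$.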



3.3 Other Features 

We present a summary of other features in Figures 3 and 4. Each of these features 
depends on SPar($\omega$). It would be easy enough to write a version dependent on 
SPar(n) for some finite n. 
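Global Definition of Constants, cdef. Depends on SPar($\omega$). 

Signature Declaration $\delta ::= \cdots \mid c^\alpha[\Delta^\alpha] := M : A$ 

If $c[\Delta] := M : A$ is in the signature, the following is a reduction rule: 

\[ c[\vec{N}] \to_{\delta c} \{\vec{N}/\operatorname{dom}\Delta\}M \]

\[
\begin{gathered}
\dfrac{\Delta \vdash_\Sigma M : A}{\Sigma,\ c[\Delta] := M : A\ \ \mathrm{sig}}\;(c \notin \operatorname{dom}\Sigma)
\qquad
\dfrac{\Gamma \vdash_\Sigma M : A \quad \Gamma \vdash_\Sigma B\ \mathrm{kind}}{\Gamma \vdash_\Sigma M : B}\;(\Gamma \vdash_\Sigma A =_{\delta c} B)
\\[1ex]
\dfrac{\Gamma \Vdash_\Sigma \vec{N} :: \Delta}{\Gamma \vdash_\Sigma c[\vec{N}] : \{\vec{N}/\operatorname{dom}\Delta\}A}\;(c[\Delta] := M : A \in \Sigma)
\end{gathered}
\]

Global Definition of Variables, vdef. Depends on SPar($\omega$). 

Context Declaration $\gamma ::= \cdots \mid x^\alpha[\Delta^\alpha] := M : A$ 

If $x^\alpha[\Delta^\alpha] := M^\beta : A^\beta$ is in the context, the following is a reduction rule: 

\[ x[\vec{N}] \to_{\delta v} \{\vec{N}/\operatorname{dom}\Delta\}M \]

\[
\begin{gathered}
\dfrac{\Gamma, \Delta \vdash_\Sigma M : A}{\Gamma,\ x[\Delta] := M : A \vdash_\Sigma \mathrm{valid}}\;(x \notin \operatorname{dom}\Gamma)
\qquad
\dfrac{\Gamma \vdash_\Sigma M : A \quad \Gamma \vdash_\Sigma B\ \mathrm{kind}}{\Gamma \vdash_\Sigma M : B}\;(\Gamma \vdash_\Sigma A =_{\delta v} B)
\\[1ex]
\dfrac{\Gamma \Vdash_\Sigma \vec{N} :: \Delta}{\Gamma \vdash_\Sigma x[\vec{N}] : \{\vec{N}/\operatorname{dom}\Delta\}A}\;(x[\Delta] := M : A \in \Gamma)
\end{gathered}
\]

Local Definitions, let. Depends on vdef. 

\[
\begin{array}{lrcl}
\text{Term} & M & ::= & \cdots \mid \mathrm{let}\ x^\alpha[\Delta^\alpha] := M : A\ \mathrm{in}\ M \\
\text{Kind} & A & ::= & \cdots \mid \mathrm{let}\ x^\alpha[\Delta^\alpha] := M : A\ \mathrm{in}\ A
\end{array}
\]

\[
\begin{aligned}
\mathrm{let}\ v[\Delta] := M : A\ \mathrm{in}\ N &\to_{\delta l} \{[\operatorname{dom}\Delta]M/v\}N \\
\mathrm{let}\ v[\Delta] := M : A\ \mathrm{in}\ K &\to_{\delta l} \{[\operatorname{dom}\Delta]M/v\}K
\end{aligned}
\]

\[
\begin{gathered}
\dfrac{\Gamma,\ v[\Delta] := M : A \vdash_\Sigma K\ \mathrm{kind}}{\Gamma \vdash_\Sigma \mathrm{let}\ v[\Delta] := M : A\ \mathrm{in}\ K\ \ \mathrm{kind}}
\qquad
\dfrac{\Gamma \vdash_\Sigma M : A \quad \Gamma \vdash_\Sigma B\ \mathrm{kind}}{\Gamma \vdash_\Sigma M : B}\;(A =_{\delta l} B)
\\[1ex]
\dfrac{\Gamma,\ v[\Delta] := M : A \vdash_\Sigma N : K}{\Gamma \vdash_\Sigma \mathrm{let}\ v[\Delta] := M : A\ \mathrm{in}\ N : \mathrm{let}\ v[\Delta] := M : A\ \mathrm{in}\ K}
\end{gathered}
\]

Fig. 3. Miscellaneous features 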
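Judgemental Equality, eq. Depends on SPar($\omega$). 

\[
\begin{array}{lrcl}
\text{Judgement body} & J & ::= & \cdots \mid M = M : A \mid A = A \\
\text{Signature declaration} & \delta & ::= & \cdots \mid (\Delta)(M = M : A)
\end{array}
\]

\[
\begin{gathered}
\dfrac{\Gamma \vdash_\Sigma M : A \quad \Gamma \vdash_\Sigma N : A}{\Gamma \vdash_\Sigma M = N : A}\;(M \simeq N)
\qquad
\dfrac{\Gamma \vdash_\Sigma A\ \mathrm{kind} \quad \Gamma \vdash_\Sigma B\ \mathrm{kind}}{\Gamma \vdash_\Sigma A = B}\;(A \simeq B)
\\[1ex]
\dfrac{\Delta \vdash_\Sigma M : A \quad \Delta \vdash_\Sigma N : A}{\Sigma,\ (\Delta)(M = N : A)\ \ \mathrm{sig}}
\\[1ex]
\dfrac{\Gamma \Vdash_\Sigma \vec{P} :: \Delta}{\Gamma \vdash_\Sigma \{\vec{P}/\operatorname{dom}\Delta\}M = \{\vec{P}/\operatorname{dom}\Delta\}N : \{\vec{P}/\operatorname{dom}\Delta\}A}\;((\Delta)(M = N : A) \in \Sigma)
\\[1ex]
\dfrac{\Gamma \vdash_\Sigma M = N : A}{\Gamma \vdash_\Sigma N = M : A}
\qquad
\dfrac{\Gamma \vdash_\Sigma M = N : A \quad \Gamma \vdash_\Sigma N = P : A}{\Gamma \vdash_\Sigma M = P : A}
\\[1ex]
\dfrac{\Gamma \vdash_\Sigma A = B}{\Gamma \vdash_\Sigma B = A}
\qquad
\dfrac{\Gamma \vdash_\Sigma A = B \quad \Gamma \vdash_\Sigma B = C}{\Gamma \vdash_\Sigma A = C}
\\[1ex]
\dfrac{\Gamma \vdash_\Sigma M : A \quad \Gamma \vdash_\Sigma A = B}{\Gamma \vdash_\Sigma M : B}
\qquad
\dfrac{\Gamma \vdash_\Sigma M = N : A \quad \Gamma \vdash_\Sigma A = B}{\Gamma \vdash_\Sigma M = N : B}
\end{gathered}
\]

Fig. 4. Miscellaneous features 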



3.4 Conservativity Results 

The guiding principle behind the modular hierarchy is that the features are 
defined, and behave, independently of one another. The formal result that 
corresponds to this principle is: 

Theorem 1. If L is a logical framework in the hierarchy, and F a feature such 
that every feature on which F depends is present in L, then L + F is a 
conservative extension of L. 

This theorem can be proven for the finitely many features we have presented 
in this paper. We prove that, if J is a judgement of L derivable in L + F, then 
J is derivable in L, by direct induction on the derivation of J in L + F. The 
only non-trivial cases are the conversion rules; these require the Church-Rosser 
property to be proven for the set of typable terms. This is never too demanding; 
even the case of $\beta\eta$-conversion can be handled using, for example, the techniques 
of [7], because the frameworks, as type systems, are very simple: there is a single 
predicative universe and no reflection. 



4 Existing Logical Frameworks 

We show here how several existing logical frameworks are equivalent to systems 
that are built out of the features we have introduced. As well as the frameworks 
we have already mentioned, we deal with Luo’s frameworks LF [8]. 



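\[
\begin{array}{rcl}
\text{PAL} & = & \mathrm{BF} + \mathrm{LPar}(1) + \mathrm{cdef} \\
\text{AUT-68} & \sim & \mathrm{BF} + \mathrm{SPar}(\omega) + \lambda^t_\beta + \mathrm{LPar}(1) + \mathrm{cdef} \\
\text{AUT-QE} & \sim & \mathrm{BF} + \mathrm{LPar}(\omega) + {} + \mathrm{cdef} \\
\text{ELF} & = & \mathrm{BF} + \mathrm{SPar}(\omega) + \lambda^t_\beta \\
\text{Martin-Löf's Theory of Types} & = & \mathrm{BF} + \mathrm{LPar}(\omega) + \lambda^u_{\beta\eta} + \mathrm{eq} \\
\text{LF} & = & \mathrm{BF} + \mathrm{LPar}(\omega) + \lambda^t_{\beta\eta} + \mathrm{eq}
\end{array}
\]

(Note: in the second line, the version of $\lambda^t_\beta$ (typed abstraction, $\beta$-conversion) 
included is built on top of SPar($\omega$) 
only, not LPar(1). AUT-68 allows the declaration of constants with first-order 
parameters, but does not allow such lambda-abstractions to be formed.) 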
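
One practical use of such a table is to read off common subsystems mechanically. The sketch below records only the parametrization component of the four frameworks related by an equality sign; the dictionary, the function names, and the assumption that LPar(n) includes SPar(m) whenever m is at most n (while SPar never includes LPar) are illustrative choices, not definitions from the text.

```python
import math
from typing import List, Tuple

Feature = Tuple[str, float]  # ("SPar" or "LPar", level), with math.inf for omega

# Parametrization feature of each framework, as read off the table above.
PARAMETRIZATION = {
    "PAL": ("LPar", 1),
    "ELF": ("SPar", math.inf),
    "Martin-Löf's Theory of Types": ("LPar", math.inf),
    "LF": ("LPar", math.inf),
}

def provides(have: Feature, want: Feature) -> bool:
    """True if the feature `have` includes the feature `want` (assumed ordering)."""
    have_kind, have_level = have
    want_kind, want_level = want
    if have_kind == "SPar" and want_kind == "LPar":
        return False  # small-kind parameters never give large-kind ones
    return want_level <= have_level

def frameworks_containing(want: Feature) -> List[str]:
    """Names of the frameworks whose parametrization includes `want`."""
    return sorted(
        name for name, have in PARAMETRIZATION.items() if provides(have, want)
    )
```

Asking for `("SPar", math.inf)` returns ELF, LF and Martin-Löf's Theory of Types but not PAL, which only has first-order parameters.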

The notion of equivalence with which we are working is the possibility of 
defining translations between the members of the syntactic classes of the two 
frameworks, such that the translate of each rule of deduction of one is admissible 
in the other, and which are inverses of one another up to the relevant notion of 
convertibility within each framework. 

For the lines in which we have used an equality sign, such translations can 
be given; the correspondence between the existing framework and the one 
produced by the hierarchy is fairly close. For the first two ‘AUT-’ frameworks, the 
correspondence is not nearly as neat. There is a correspondence between the 
hierarchy framework and a variant of the AUTOMATH framework. This variant 
removes the distinction between, for example, the constant defined by 

(0, x, -, A), (x, c, PN, B) 

(defining c with parameter x : A inside the kind B) and that defined by 

(0, c, PN, [x : A]B) 

(defining c with no parameters inside the kind [x : A]B). It also replaces 
AUTOMATH’s system of declaring variables with a more orthodox system of 
contexts. 

It is possible to make the correspondence in these two cases better; and it 
is also possible to tighten the other four, so that the translations are inverses 
up to identity (that is, $\alpha$-conversion), not just convertibility. However, doing so 
requires a large number of features to be defined, with hair-splitting distinctions 
being made between them. It is not at all clear that the advantages are worth 
this cost. 

To build PAL$^+$ in the hierarchy, there are two possibilities. Firstly, we could 
write a feature that introduces classes of $\alpha$-ary terms and kinds for every arity 
$\alpha$, in a similar manner to the $\lambda$-abstraction feature detailed in Section 3.2, 
but the only such terms are the $\alpha$-ary variables 
and constants. Then we could build on top of this features similar to vdef 
and let, but allowing global and local definitions of any arity term and kind. 
Putting these three features on top of BF + LPar($\omega$) + eq yields a framework 
equivalent to PAL$^+$. 

Alternatively, we could build features similar to vdef and let on top of 
BF + LPar($\omega$) + eq together with a typed $\lambda$-abstraction feature, including a 
redundancy that identifies $[x_1 : A_1] \cdots [x_n : A_n]M$ 
with $\mathrm{let}\ v[x_1 : A_1, \ldots, x_n : A_n] = M : A\ \mathrm{in}\ v$, where $A$ is an inferred kind 
for $M$. 



5 Use of Frameworks 

Note that all the existing frameworks we have considered (with the notable 
exception of PAL) use either SPar($\omega$) or LPar($\omega$). This is natural if one is 
approaching frameworks from the point of view of the lambda calculus; these 
are the easiest features to define as (say) PTSs. However, it is overkill. For: 

Theorem 2. 

- The grammar of propositional logic, and Hilbert-style rules of 
inference, are representable in BF + SPar(1). 

- The grammar of predicate logic, and natural deduction-style rules of 
inference, are representable in BF + SPar(2). 

- Martin-Löf’s Theory of Sets is representable in BF + LPar(2) + eq. 

We only have space here to partially justify a few of these claims. We shall 
show how to build an arbitrary first-order theory in BF + SPar(2), and how 
W-types are built within BF + LPar(2) + eq. 

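For a first-order theory in BF + SPar(2), the signature consists of: 

\[
\begin{array}{rcl}
\mathrm{term} & : & \mathrm{Type} \\
F & : & (x_1 : \mathrm{El}(\mathrm{term}), \ldots, x_n : \mathrm{El}(\mathrm{term}))\,\mathrm{El}(\mathrm{term}) \\
 & & \text{for each $n$-ary function symbol $F$ in the language} \\
\mathrm{prop} & : & \mathrm{Type} \\
P & : & (x_1 : \mathrm{El}(\mathrm{term}), \ldots, x_n : \mathrm{El}(\mathrm{term}))\,\mathrm{El}(\mathrm{prop}) \\
 & & \text{for each $n$-ary predicate symbol $P$ in the language} \\
\to & : & (x : \mathrm{El}(\mathrm{prop}), y : \mathrm{El}(\mathrm{prop}))\,\mathrm{El}(\mathrm{prop}) \\
\forall & : & (p : (x : \mathrm{El}(\mathrm{term}))\,\mathrm{El}(\mathrm{prop}))\,\mathrm{El}(\mathrm{prop}) \\
\mathrm{Prf} & : & (x : \mathrm{El}(\mathrm{prop}))\,\mathrm{Type} \\
{\to}\mathrm{I} & : & (p, q : \mathrm{El}(\mathrm{prop}),\ H : (x : \mathrm{El}(\mathrm{Prf}[p]))\,\mathrm{El}(\mathrm{Prf}[q]))\,\mathrm{El}(\mathrm{Prf}[{\to}[p, q]]) \\
{\to}\mathrm{E} & : & (p, q : \mathrm{El}(\mathrm{prop}),\ H_1 : \mathrm{El}(\mathrm{Prf}[{\to}[p, q]]),\ H_2 : \mathrm{El}(\mathrm{Prf}[p]))\,\mathrm{El}(\mathrm{Prf}[q]) \\
\forall\mathrm{I} & : & (p : (x : \mathrm{El}(\mathrm{term}))\,\mathrm{El}(\mathrm{prop}),\ H : (x : \mathrm{El}(\mathrm{term}))\,\mathrm{El}(\mathrm{Prf}[p[x]]))\,\mathrm{El}(\mathrm{Prf}[\forall[p]]) \\
\forall\mathrm{E} & : & (p : (x : \mathrm{El}(\mathrm{term}))\,\mathrm{El}(\mathrm{prop}),\ t : \mathrm{El}(\mathrm{term}),\ H : \mathrm{El}(\mathrm{Prf}[\forall[p]]))\,\mathrm{El}(\mathrm{Prf}[p[t]])
\end{array}
\]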

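Theorem 3. 

1. There is a bijection $\rho$ between the terms with free variables 
among $x_1, \ldots, x_n$ in the first-order language, and the terms $M$ such that 

\[ x_1 : \mathrm{El}(\mathrm{term}), \ldots, x_n : \mathrm{El}(\mathrm{term}) \vdash_\Sigma M : \mathrm{El}(\mathrm{term}) \]

2. There is a bijection $\sigma$ between the formulas with free variables among 
$x_1, \ldots, x_n$ in the first-order language, and the terms $M$ such that 

\[ x_1 : \mathrm{El}(\mathrm{term}), \ldots, x_n : \mathrm{El}(\mathrm{term}) \vdash_\Sigma M : \mathrm{El}(\mathrm{prop}) \]

3. Let $\phi, \psi_1, \ldots, \psi_m$ be formulas with free variables among $x_1, \ldots, x_n$. Then 
$\phi$ is provable from hypotheses $\psi_1, \ldots, \psi_m$ iff there is a term $M$ such that 

\[
x_1 : \mathrm{El}(\mathrm{term}), \ldots, x_n : \mathrm{El}(\mathrm{term}),\ y_1 : \mathrm{El}(\mathrm{Prf}[\sigma(\psi_1)]), \ldots, y_m : \mathrm{El}(\mathrm{Prf}[\sigma(\psi_m)])
\vdash_\Sigma M : \mathrm{El}(\mathrm{Prf}[\sigma(\phi)])
\]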

Notice that the correspondence between the entities of the object logic and 
the terms of the logical framework is a bijection up to identity (that is, 
$\alpha$-conversion), not up to convertibility; indeed, in a framework whose only features 
are SPar(n) and LPar(n), there is no such thing as convertibility. This theorem 
is much easier to prove than most adequacy theorems, because the 
correspondence between the framework and the object logic is so much closer than in a 
traditional logical framework. 
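
The direction of the bijections that maps object-language syntax to framework terms is a straightforward structural recursion. The sketch below illustrates one plausible such translation and is not the construction used in the proof: the class names and the functions `rho` and `sigma` are assumptions, framework terms are rendered as plain strings, 0-ary variables and constants are written without brackets, and only implication and universal quantification are covered, matching the signature above.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fn:
    symbol: str
    args: Tuple["FoTerm", ...]

FoTerm = Union[Var, Fn]

@dataclass(frozen=True)
class Pred:
    symbol: str
    args: Tuple[FoTerm, ...]

@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Forall:
    var: str
    body: "Formula"

Formula = Union[Pred, Implies, Forall]

def apply(head: str, args: Tuple[str, ...]) -> str:
    """Render head[args]; a 0-ary head is written without brackets."""
    return head if not args else head + "[" + ", ".join(args) + "]"

def rho(t: FoTerm) -> str:
    """Translate a first-order term into a framework term of kind El(term)."""
    if isinstance(t, Var):
        return t.name
    return apply(t.symbol, tuple(rho(a) for a in t.args))

def sigma(phi: Formula) -> str:
    """Translate a formula into a framework term of kind El(prop)."""
    if isinstance(phi, Pred):
        return apply(phi.symbol, tuple(rho(a) for a in phi.args))
    if isinstance(phi, Implies):
        return apply("→", (sigma(phi.left), sigma(phi.right)))
    # The quantifier takes a single abstraction [x]M as its argument.
    return apply("∀", ("[" + phi.var + "]" + sigma(phi.body),))
```

For example, the formula "for all x, P(x) implies Q(x)" is rendered as `∀[[x]→[P[x], Q[x]]]`; no β-redexes can arise, which is why the correspondence holds up to identity.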

We now show how to build W-types within BF + LPar(2) + eq. In the 
following, we shall suppress instances of El, and use $\eta$-contractions; e.g. we write 
$W[A, B]$ for $W[A, [x : A]B[x]]$. 

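\[
\begin{array}{rcl}
W & : & (A : \mathrm{Type},\ B : (A)\mathrm{Type})\,\mathrm{Type}, \\
\mathrm{sup} & : & (A : \mathrm{Type},\ B : (A)\mathrm{Type},\ a : A,\ b : (B[a])W[A, B])\,W[A, B] \\
E_W & : & (A : \mathrm{Type},\ B : (A)\mathrm{Type},\ C : (W[A, B])\mathrm{Type}, \\
 & & \ f : (x : A,\ y : (B[x])W[A, B],\ g : (v : B[x])C[y[v]])\,C[\mathrm{sup}[A, B, x, y]], \\
 & & \ z : W[A, B])\,C[z],
\end{array}
\]

\[
\begin{array}{l}
(A : \mathrm{Type},\ B : (A)\mathrm{Type},\ C : (W[A, B])\mathrm{Type}, \\
\ f : (x : A,\ y : (B[x])W[A, B],\ g : (v : B[x])C[y[v]])\,C[\mathrm{sup}[A, B, x, y]], \\
\ a : A,\ b : (B[a])W[A, B]) \\
\quad (E_W[A, B, C, f, \mathrm{sup}[A, B, a, b]] = f[a, b, [v : B[a]]E_W[A, B, C, f, b[v]]] \\
\qquad : C[\mathrm{sup}[A, B, a, b]])
\end{array}
\]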
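
The computation rule for the eliminator can be mirrored by an ordinary recursive function. The following sketch is untyped, so the dependency of B on a and of C on the tree is lost, and the names `Sup`, `elim_w`, `absurd`, `zero`, `succ` and `to_int` are hypothetical; it is meant only to show the shape of the recursion.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Sup:
    """A well-founded tree sup[a, b]: a label and a family of subtrees."""
    label: Any                      # a : A
    branch: Callable[[Any], "Sup"]  # b : (B[a]) W[A, B]

def elim_w(f: Callable[[Any, Callable, Callable], Any], tree: Sup) -> Any:
    """E_W applied to sup[a, b] is f[a, b, [v] E_W applied to b[v]]."""
    return f(tree.label, tree.branch, lambda v: elim_w(f, tree.branch(v)))

def absurd(v: Any) -> Sup:
    """Branch function for a label with no positions (B[a] empty)."""
    raise ValueError("no branches at this node")

# Natural numbers as a W-type: "zero" has no branches, "succ" has exactly one.
zero = Sup("zero", absurd)

def succ(n: Sup) -> Sup:
    return Sup("succ", lambda v: n)

def to_int(label: Any, branch: Callable, rec: Callable) -> int:
    # rec(()) is the recursive result on the single subtree of a "succ" node.
    return 0 if label == "zero" else 1 + rec(())
```

With these definitions, `elim_w(to_int, succ(succ(zero)))` unfolds the computation rule twice and then reaches the leaf.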
6 Conclusion 

We have given a modular method for defining logical frameworks, and shown 
that it captures, up to a reasonable notion of equivalence, several existing 
logical frameworks. It has revealed common subsystems between these frameworks 
that may not otherwise have been found; it is doubtful, for example, that 
one would have discovered the fact that there is a system BF + SPar($\omega$) which 
can be conservatively embedded in both ELF and Martin-Löf's Theory of Types 
without this work. It has revealed much weaker frameworks than we are 
accustomed to using, that may prove advantageous for theoretical work, such as 
proving adequacy theorems. And, finally, it may yet provide a method for 
defining features in a generic manner such that they can be added to any logical 
framework, and their properties studied independently of any framework. 



Future and Related Work 

The only work of a similar nature of which I am aware is the Tinkertype system
[10]. There are striking similarities between the two systems. However, I believe
this work is different in character. Tinkertype's features cannot be defined
separately, and do not behave independently; they certainly do not always yield
conservative extensions. While this would not be a desideratum for type systems
in general, with which Tinkertype deals, I believe it is important for logical
frameworks.

In the future, as well as the obvious matters of defining more features, capturing
more aspects of logical frameworks, and exploring the properties of features
independently of one another, it would be interesting to see if we could lay down
general conditions C1, C2, ... on features, and prove results such as:

Any feature satisfying conditions C1, C2, ... yields a conservative extension
of any framework composed solely of features that satisfy conditions C1,
C2, ...

It would also be interesting to see if we could prove generalised adequacy results 
using the hierarchy, and give general definitions of semantics for an object theory 
and prove generalised soundness and completeness results. 
Abstract. Conditions on type preorders are provided in order to characterize the
induced filter models for the λ-calculus and some of its restrictions. Besides, two
examples are given of filter models in which not all the continuous functions are
representable.



1 Introduction 

The semantics of the λ-calculus can be looked at from several points of view. A possible
one considers a model as an abstract way of handling and dealing with the syntax. This
is the point of view of those investigations looking for extensions of the λ-calculus such
that the intended semantical domain turns out to be fully abstract w.r.t. the calculus.

From another point of view, instead, the semantics is seen mainly as a tool to confirm
one's “syntactic intuitions” and to prove properties of the calculus. According to
this latter viewpoint, “semantically oriented” extensions of a calculus are not always
commendable. The focus is on the calculus: the model has to fit the calculus as tightly
as possible, not vice versa. This is indeed the point of view of the present paper, and, in
general, of an investigation we are carrying on, started in a companion paper [3].
In this research we try to devise a general setting and uniform tools to “tailor” models
closely fitting as many aspects as possible of the computational paradigm embodied by
the λ-calculus.

One of the most natural frameworks for such an investigation is the typing discipline
with Intersection Types. Intersection type assignment systems allow one to characterize many
of the most important denotational (as well as operational) properties of λ-terms. In
particular it is possible to describe, in a natural and finitary way, many semantic domains
for the λ-calculus. Such finitary descriptions allow one not only to analyze pre-existing
models, but also to modify them, sometimes “tailoring” them according to one's needs
(see [6,10,14,18,17,22,5,12] and the references there).

Finitary characterizations of models for the λ-calculus, the so-called filter models, can
be obtained by simply introducing specific constants, typing rules and type preorders
in a basic intersection type assignment system. An element of a particular domain,
representing the denotational meaning of a term M, then turns out to correspond to the
set of types that can be inferred for M.

* Partially supported by MURST project NAPOLI.

** Partially supported by EU within the FET - Global Computing initiative, project DART
IST-2001-33477, and by MURST Cofin'02 project McTati. The funding bodies are not responsible
for any use that might be made of the results presented here.

In [3] we have characterized those intersection type assignment systems aiming, in
perspective, at providing finitary descriptions of filter models validating in a precise
way the notions of β and η reduction and expansion for the whole λ-calculus, as well as
some of their restrictions, like β_v [21], β-I [11] and β-KN [17].

The present paper continues in the same direction by proving a number of characterization
results for filter λ-structures induced by type preorders.

Since any type preorder can induce a particular filter λ-structure, it is possible to “tailor”
particular models by providing suitable conditions on the inducing type preorders. Our
first “tailoring” result characterizes those type preorders inducing λ-structures in which
relevant sets of functions can be represented. A second result characterizes λ-structures
which are models of the whole λ-calculus. In a third result we characterize those filter
λ-structures which are also models of the aforementioned restricted λ-calculi: the
call-by-value λ-calculus, the λI-calculus, the λKN-calculus. The result is also extended to
the extensional models.

A further “tailoring” result of the present paper concerns the possibility of “trimming”
something that is usually overabundant in filter models: the set of the representable
functions. Such a task is not a trivial one in the intersection filters setting. In fact in any
filter model introduced in the literature, except the one in [8], any continuous function is
representable. Our contribution to this task is the construction of type preorders inducing
filter models of the whole λ-calculus in which not all continuous functions are
representable. The proofs of this property rely on the characterization results of the
paper.

We shall assume the reader to be acquainted with the main concepts concerning the
λ-calculus and its models. The paper is structured as follows: in Section 2 we recall
the notions of intersection type language, type preorder and type assignment system,
while the definitions of filter λ-structure and filter model are recalled in Section 3.
The four characterization results form the subject of Section 4. In Section 5 we
define two particular preorders in whose induced filter models only a proper subset of
the continuous functions is representable.

2 Intersection Types Languages and Type Assignments 

Intersection types, the building blocks for the filter models, are syntactical objects built
by closing a given set ℂ of type atoms (constants) under the function type constructor
→ and the intersection type constructor ∩.

Definition 1 (Intersection type language). The intersection type language over ℂ,
denoted by T = T(ℂ), is defined by the following abstract syntax:

T = ℂ | T→T | T ∩ T.
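The abstract syntax of Definition 1 can be mirrored by a small algebraic datatype. The following is a minimal sketch, not part of the paper: the class names Atom, Arrow and Inter and the helpers show and inter_all are hypothetical, and the printer writes intersection as & purely for readability.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Atom:
    """A type constant taken from the set of atoms."""
    name: str


@dataclass(frozen=True)
class Arrow:
    """The function type dom -> cod."""
    dom: object
    cod: object


@dataclass(frozen=True)
class Inter:
    """The intersection type of left and right."""
    left: object
    right: object


def show(t):
    """Render a type with explicit parentheses."""
    if isinstance(t, Atom):
        return t.name
    if isinstance(t, Arrow):
        return f"({show(t.dom)} -> {show(t.cod)})"
    return f"({show(t.left)} & {show(t.right)})"


def inter_all(types):
    """Fold a non-empty list A1, ..., An into a left-nested intersection."""
    return reduce(Inter, types)
```

Since intersection is commutative and associative only modulo the induced equivalence, the left-nested shape produced by inter_all is one arbitrary representative.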

Much of the expressive power of intersection type languages comes from the fact
that they are endowed with a preorder relation, ≤, which induces, on the set of types,
the structure of a meet semi-lattice with respect to ∩.
Definition 2 (Intersection type preorder). An intersection type preorder is a pair
(ℂ, ≤) where ℂ is a set of type constants and ≤ is a binary relation over T = T(ℂ)
satisfying the following set of axioms and rules:

(refl)    A ≤ A
(idem)    A ≤ A ∩ A
(incl_L)  A ∩ B ≤ A
(incl_R)  A ∩ B ≤ B
(mon)     A ≤ A′ and B ≤ B′  ⟹  A ∩ B ≤ A′ ∩ B′
(trans)   A ≤ B and B ≤ C  ⟹  A ≤ C
(Ω)       if Ω ∈ ℂ:  A ≤ Ω
(ν)       if ν ∈ ℂ:  A→B ≤ ν

Notation. - Σ will be short for (ℂ, ≤).

- A ~ B will be short for A ≤ B ≤ A.

- Since ∩ is commutative and associative (modulo ~), we shall write ⋂_{i≤n} A_i for
A_1 ∩ ... ∩ A_n. Similarly we shall write ⋂_{i∈I} A_i, where I always denotes a finite set.
Moreover we make the convention that ⋂_{i∈∅} A_i is Ω when Ω ∈ ℂ.

- We shall denote by ≤_∇ the type preorder generated by a recursive set ∇ of axioms and
rules of the shape A ≤ B (where ∇ is said to generate ≤ if A ≤ B holds if and only
if it can be derived from the axioms and rules of ∇ together with those in Definition 2).
The constants in ∇ will be denoted by ℂ^∇.

- When we consider an intersection type preorder of the form (ℂ^∇, ≤_∇), we shall write
T^∇ and Σ^∇ for T(ℂ^∇) and (ℂ^∇, ≤_∇), respectively.

- A ~_∇ B will be short for A ≤_∇ B ≤_∇ A.

- We write “the type preorder Σ validates ∇” to mean that all axioms and rules of ∇ are
admissible. 1

Figure 1 lists a few special purpose axioms and rules which have been considered in
the literature. Their meaning can be grasped if we consider types to denote subsets of a
domain of discourse and we look at → as the function space constructor in the light of
Curry-Scott semantics, see [23].



(Ω-η)     Ω ≤ Ω→Ω
(→-∩)     (A→B) ∩ (A→C) ≤ A→B ∩ C
(Ω-lazy)  A→B ≤ Ω→Ω
(η)       A′ ≤ A and B ≤ B′  ⟹  A→B ≤ A′→B′

Fig. 1. Possible Axioms and Rules concerning ≤.

We can now introduce four significant intersection type preorders which have been
extensively considered in the literature. The order is logical, rather than historical, and
the references define the corresponding filter models: [9], [14], [1], [6]. A richer list of
type preorders can be found in [3]. These preorders are of the form Σ^∇ = (ℂ^∇, ≤_∇),
with various different names ∇, picked for mnemonic reasons. In Figure 2 we list their
sets of constants ℂ^∇ and their sets ∇ of extra axioms and rules taken from Figure 1.
Here ℂ_∞ is an infinite set of fresh atoms (i.e. different from Ω, ν).

1 Recall that a rule is admissible in a system if, for each instance of the rule, if its premises are 
derivable in the system then so is its conclusion. 
ℂ^CDV = ℂ_∞            CDV = {(→-∩), (η)}
ℂ^EHR = {ν}            EHR = CDV ∪ {(ν)}
ℂ^AO  = {Ω}            AO  = CDV ∪ {(Ω), (Ω-lazy)}
ℂ^BCD = {Ω} ∪ ℂ_∞      BCD = CDV ∪ {(Ω), (Ω-η)}

Fig. 2. Particular Atoms, Axioms and Rules.



2.1 Particular Classes of Type Preorders 

In this subsection we introduce important classes of type preorders. The first two are
the classes of natural type preorders and of strict natural type preorders. These are
disjoint classes, whose relevance lies in their allowing various characterizations in terms
of approximable mappings and λ-structures.

Definition 3 ((Strict) Natural type preorders).

Let Σ = (ℂ, ≤) be a type preorder.

(i) Σ is strict natural if Ω ∉ ℂ and it validates CDV as defined in Figure 2.

(ii) Σ is natural if Ω ∈ ℂ and it validates AO as defined in Figure 2.

Naturality for type preorders has a strong semantic flavour. If we look at intersection as
representing join and at arrow types as representing functions, then rule (→-∩) reflects
the join property of step functions with the same antecedent ((d ⇒ e) ⊔ (d ⇒ e′) ⊒
d ⇒ (e ⊔ e′)), rule (η) reflects the order relation between step functions (d′ ⊑ d and
e ⊑ e′ imply d ⇒ e ⊑ d′ ⇒ e′), and rule (Ω-lazy) reflects the fact that ⊥ ⇒ ⊥ is the
bottom function.
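The three facts about step functions invoked here can be checked mechanically on a small example. The sketch below is an illustration under stated assumptions, not taken from the paper: the domain is the powerset of {0, 1} ordered by inclusion (so join is union and bottom is the empty set), and the names POINTS, step, below, join and same are hypothetical.

```python
from itertools import combinations, product

BASE = (0, 1)
# All subsets of BASE, ordered by inclusion: a four-element lattice.
POINTS = [frozenset(c) for r in range(len(BASE) + 1) for c in combinations(BASE, r)]
BOTTOM = frozenset()


def step(d, e):
    """The step function d => e: returns e above d and bottom elsewhere."""
    return lambda x: e if d <= x else BOTTOM


def below(f, g):
    """Point-wise ordering of functions on POINTS."""
    return all(f(x) <= g(x) for x in POINTS)


def join(f, g):
    """Point-wise join of two functions."""
    return lambda x: f(x) | g(x)


def same(f, g):
    return below(f, g) and below(g, f)


# Join of step functions with the same antecedent.
join_ok = all(
    same(join(step(d, e), step(d, e2)), step(d, e | e2))
    for d, e, e2 in product(POINTS, repeat=3)
)

# Order relation between step functions.
order_ok = all(
    below(step(d, e), step(d2, e2))
    for d, d2, e, e2 in product(POINTS, repeat=4)
    if d2 <= d and e <= e2
)

# The step function from bottom to bottom is below every step function.
bottom_ok = all(
    below(step(BOTTOM, BOTTOM), step(d, e)) for d, e in product(POINTS, repeat=2)
)
```

On this lattice the join property holds as an equality, which in particular gives the inequality used to read rule (→-∩).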

Among the type preorders of Figure 2, CDV, EHR are strict natural, and AO, BCD
are natural.

Notice that by the implicit assumption that axiom (ν) ∈ Σ whenever ν ∈ ℂ (Definition 2)
a strict natural type theory containing the constant ν validates EHR.

We introduce two other interesting classes of preorders playing a crucial role in the 
characterization results of Section 4. 

Definition 4 (Beta and eta preorders).

Let Σ = (ℂ, ≤) be a type preorder and T = T(ℂ).

(i) Σ is beta iff for all I, A_i, B_i, C, D ∈ T:

⋂_{i∈I}(A_i→B_i) ≤ C→D  ⟹  ⋂_{i∈J} B_i ≤ D   where J = {i ∈ I | C ≤ A_i}.

(ii) Σ is eta iff for all ψ ∈ ℂ at least one of the following conditions holds:

1) ν ≤ ψ;

2) there exist I, A_i, B_i ∈ T such that ⋂_{i∈I}(A_i→B_i) ≤ ψ and B_i ~ Ω for all i ∈ I;

3) there exist non-empty families of types {A_i, B_i}_{i∈I}, {D_{i,j}, E_{i,j}}_{j∈J_i} in T such
that

⋂_{i∈I}(A_i→B_i) ≤ ψ ≤ ⋂_{i∈I}(⋂_{j∈J_i}(D_{i,j}→E_{i,j})) &
∀i ∈ I. A_i ≤ ⋂_{j∈J_i} D_{i,j} & ⋂_{j∈J_i} E_{i,j} ≤ B_i.
The condition for a natural type theory of being beta reflects the criterion used to
establish whether a sup of step functions is greater than a step function (see [15]).

When Σ = Σ^∇ it is usually possible to prove the conditions defined above by
induction on the derivation of judgments. For the type preorders of Figure 2 we get that
they are all beta and EHR, AO are eta.



2.2 Intersection Type Assignments 

We introduce now the notion of intersection type assignment system. First we need a 
few preliminary definitions. Let Var denote the set of term variables. 

Definition 5 (Type assignment system).

(i) A basis over ℂ is a set of statements of the shape x:B, where the subjects x are
in Var, the predicates B are in T(ℂ), and all subjects are distinct variables.

(ii) An intersection-type assignment system relative to Σ = (ℂ, ≤), denoted by λ∩^Σ,
is a formal system for deriving judgments of the form Γ ⊢^Σ M : A, where the subject
M is an untyped λ-term, the predicate A is in T(ℂ), and Γ is a basis over ℂ.



Notation. We shall write:

- x ∈ Γ as short for (x : A) ∈ Γ for some A;

- Γ, x:A as short for Γ ∪ {x:A}, proviso x ∉ Γ.

We use ⊎ to denote the union between bases defined by:

Γ1 ⊎ Γ2 = {(x:τ) | (x:τ) ∈ Γ1 & x ∉ Γ2} ∪
          {(x:τ) | (x:τ) ∈ Γ2 & x ∉ Γ1} ∪
          {(x:τ1 ∩ τ2) | (x:τ1) ∈ Γ1 & (x:τ2) ∈ Γ2}
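The union of bases just defined amounts to merging two finite maps and intersecting the predicates of shared subjects. A minimal sketch, assuming bases are represented as dictionaries from variable names to types and an intersection is encoded as the tuple ("inter", t1, t2); both the representation and the name basis_union are hypothetical.

```python
def basis_union(g1, g2):
    """Union of two bases given as dicts from variable names to types.

    Subjects occurring in only one basis keep their predicate; subjects
    occurring in both receive the intersection of the two predicates.
    """
    merged = {}
    for x in g1.keys() | g2.keys():
        if x in g1 and x in g2:
            merged[x] = ("inter", g1[x], g2[x])
        elif x in g1:
            merged[x] = g1[x]
        else:
            merged[x] = g2[x]
    return merged
```

The result is again a basis, since every subject still occurs exactly once.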

A term M is said to be typable in λ∩^Σ, for a given basis Γ, if there is a type A ∈ T(ℂ)
such that the judgment Γ ⊢^Σ M : A is derivable.



Various type assignment systems can be defined, each of them parametrized w.r.t. a
particular Σ = (ℂ, ≤). The simplest system is given in the following definition.



Definition 6 (Basic type assignment system).

Given a type preorder Σ, the axioms and rules of the basic type assignment system,
denoted by λ∩_B^Σ, for deriving judgments Γ ⊢_B^Σ M : A, are the following:

(Ax)   Γ ⊢_B^Σ x : A   if (x:A) ∈ Γ
(→I)   Γ, x:A ⊢_B^Σ M : B  ⟹  Γ ⊢_B^Σ λx.M : A→B
(→E)   Γ ⊢_B^Σ M : A→B and Γ ⊢_B^Σ N : A  ⟹  Γ ⊢_B^Σ MN : B
(∩I)   Γ ⊢_B^Σ M : A and Γ ⊢_B^Σ M : B  ⟹  Γ ⊢_B^Σ M : A ∩ B
(≤)    Γ ⊢_B^Σ M : A and A ≤ B  ⟹  Γ ⊢_B^Σ M : B



If Ω ∈ ℂ, a natural choice is to set Ω as the universal type of all λ-terms. This amounts
to modifying the basic type assignment system by adding a suitable axiom for Ω.
Definition 7 (Ω-type assignment system).

Given a type preorder Σ = (ℂ, ≤) with Ω ∈ ℂ, the axioms and rules of the Ω-type
assignment system (denoted λ∩_Ω^Σ), for deriving judgments of the form Γ ⊢_Ω^Σ M : A,
are those of the basic one, plus the further axiom

(Ax-Ω) Γ ⊢_Ω^Σ M : Ω.

Analogously to the case of Ω, when ν ∈ ℂ, it is natural to consider ν as the universal
type for abstractions, hence modifying the basic system by the addition of a special
axiom for ν.

Definition 8 (ν-type assignment system).

Given a type preorder Σ = (ℂ, ≤) with ν ∈ ℂ, the axioms and rules of the ν-type
assignment system (denoted λ∩_ν^Σ), for deriving judgments of the form Γ ⊢_ν^Σ M : A,
are those of the basic one, plus the further axiom

(Ax-ν) Γ ⊢_ν^Σ λx.M : ν.

For simplicity we assume the symbols Ω and ν to be reserved for the universal type
constants respectively used in the systems λ∩_Ω^Σ and λ∩_ν^Σ, i.e. we forbid Ω ∈ ℂ or ν ∈ ℂ
when we deal with λ∩_B^Σ.

Notation. - λ∩^Σ will range over λ∩_B^Σ, λ∩_Ω^Σ and λ∩_ν^Σ. More precisely we assume that
λ∩^Σ stands for λ∩_Ω^Σ whenever Ω ∈ ℂ, for λ∩_ν^Σ whenever ν ∈ ℂ, and for λ∩_B^Σ otherwise.
Similarly for ⊢^Σ.

- When Σ = Σ^∇ we shall denote λ∩^Σ and ⊢^Σ by λ∩^∇ and ⊢^∇, respectively.



It is easy to prove that the following rules are admissible in λ∩^Σ.

(W)    Γ ⊢^Σ M : A and x ∉ Γ  ⟹  Γ, x:B ⊢^Σ M : A
(C)    Γ, x:B ⊢^Σ M : A and Γ ⊢^Σ N : B  ⟹  Γ ⊢^Σ M[x := N] : A
(S)    Γ, x:B ⊢^Σ M : A and x ∉ FV(M)  ⟹  Γ ⊢^Σ M : A
(≤L)   Γ, x:B ⊢^Σ M : A and C ≤ B  ⟹  Γ, x:C ⊢^Σ M : A



As usual a generation lemma is handy: its proof can be found in [4].

Lemma 1 (Generation lemma).

Let Σ = (ℂ, ≤) be a type preorder and T = T(ℂ).

(i) Assume A ≁ Ω. Then Γ ⊢^Σ x : A iff (x:B) ∈ Γ and B ≤ A for some B ∈ T.

(ii) Assume A ≁ Ω. Then Γ ⊢^Σ MN : A iff Γ ⊢^Σ M : B_i → C_i, Γ ⊢^Σ N : B_i,
and ⋂_{i∈I} C_i ≤ A for some I and B_i, C_i ∈ T.

(iii) Assume ν ≰ A. Then Γ ⊢^Σ λx.M : A iff Γ, x:B_i ⊢^Σ M : C_i, and
⋂_{i∈I}(B_i → C_i) ≤ A for some I and B_i, C_i ∈ T.



3 Filter A-Structures and (Restricted) Filter Models 

It is possible to use intersection types for building models for the λ-calculus and some of
its restrictions. Let us first recall the general notion of restricted λ-calculus.
Definition 9 (Restricted λ-calculus). Let R ⊆ {⟨(λx.M)N, M[x := N]⟩ |
M, N ∈ Λ}. The restricted λ-calculus λ_R is the calculus obtained from the standard
λ-calculus by restricting the β-rule to the redexes in R (called β-R-redexes).

The next definition of (restricted) model is a generalization of the classical notion of model
for the untyped λ-calculus of Hindley-Longo (see [16]).

Definition 10 ((Restricted) models). A model for the (restricted) λ-calculus λ_R
consists of a triple ⟨𝒟, ·, ⟦ ⟧^𝒟⟩ such that 𝒟 is a set, · : 𝒟 × 𝒟 → 𝒟, Env : Var → V for
some V ⊆ 𝒟 and the interpretation function ⟦ ⟧^𝒟 : Λ × Env → 𝒟 satisfies:

(i) ⟦x⟧^𝒟_ρ = ρ(x);

(ii) ⟦MN⟧^𝒟_ρ = ⟦M⟧^𝒟_ρ · ⟦N⟧^𝒟_ρ;

(iii) ⟦λx.M⟧^𝒟_ρ · ⟦N⟧^𝒟_ρ = ⟦M⟧^𝒟_{ρ[x := ⟦N⟧^𝒟_ρ]} for ⟨(λx.M)N, M[x := N]⟩ ∈ R;

(iv) If ρ(x) = ρ′(x) for all x ∈ FV(M), then ⟦M⟧^𝒟_ρ = ⟦M⟧^𝒟_ρ′;

(v) If y ∉ FV(M), then ⟦λx.M⟧^𝒟_ρ = ⟦λy.M[x := y]⟧^𝒟_ρ;

(vi) If ∀d ∈ V. ⟦M⟧^𝒟_{ρ[x := d]} = ⟦N⟧^𝒟_{ρ[x := d]}, then ⟦λx.M⟧^𝒟_ρ = ⟦λx.N⟧^𝒟_ρ.

⟨𝒟, ·, ⟦ ⟧^𝒟⟩ is extensional if moreover

⟦λx.Mx⟧^𝒟_ρ = ⟦M⟧^𝒟_ρ.

We can devise semantic domains out of intersection types by means of an appropriate
notion of filter over a type preorder. This is a particular case of filter over a generic
T -meet semi-lattice (see [19]).

Definition 11 (Σ-filters). Let Σ = (ℂ, ≤) be a type preorder and T = T(ℂ). A
Σ-filter (or a filter over T) is a set X ⊆ T such that

(i) if Ω ∈ ℂ then Ω ∈ X;

(ii) if A ≤ B and A ∈ X, then B ∈ X;

(iii) if A, B ∈ X, then A ∩ B ∈ X.

ℱ^Σ denotes the set of Σ-filters.

Notation. Given X ⊆ T, ↑X denotes the Σ-filter generated by X. For A ∈ T, we
write ↑A instead of ↑{A}.

It is possible to turn the space of filters into an applicative structure in which to
interpret λ-terms. Assuming the Stone duality viewpoint, the interpretation of terms
coincides with the sets of types which are deducible for them.

Definition 12 (Filter structures).

(i) Application · : ℱ^Σ × ℱ^Σ → ℱ^Σ is defined as

X · Y = ↑{B | ∃A ∈ Y. A → B ∈ X}.

(ii) For any λ-term M and environment ρ : Var → ℱ^Σ \ {∅},

⟦M⟧^Σ_ρ = {A | ∃Γ ⊨ ρ. Γ ⊢^Σ M : A},

where Γ ⊨ ρ if and only if (x : B) ∈ Γ implies B ∈ ρ(x).

(iii) A filter λ-structure is a triple ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩.
By rules (Ax-Ω), (≤) and (∩I) the interpretations of all λ-terms are filters.

Thanks to the following theorem, it is sufficient that clause (iii) of Definition 10 holds
in order for a filter λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩ to be also a model for the restricted λ-calculus
λ_R (called filter model for λ_R).

Theorem 1. For all type preorders Σ the interpretation function ⟦ ⟧^Σ satisfies conditions
(i), (ii), (iv), (v), (vi) of Definition 10.

Proof. We only consider the interesting cases. 

(ii) Let A ∈ ⟦MN⟧^Σ_ρ. The case A ~ Ω is trivial. Otherwise there exists Γ ⊨ ρ such
that Γ ⊢^Σ MN : A. By Lemma 1(ii) there exist I and B_i, C_i ∈ T such that for all i ∈ I,
Γ ⊢^Σ M : B_i → C_i, Γ ⊢^Σ N : B_i, and ⋂_{i∈I} C_i ≤ A. From the first two judgments
above, we get B_i ∈ ⟦N⟧^Σ_ρ and B_i → C_i ∈ ⟦M⟧^Σ_ρ. By definition of application it follows
that A ∈ ⟦M⟧^Σ_ρ · ⟦N⟧^Σ_ρ.

Let now A ∈ ⟦M⟧^Σ_ρ · ⟦N⟧^Σ_ρ. Then there exist I, B_i, C_i ∈ T such that ⋂_{i∈I} C_i ≤ A and
for any i ∈ I, B_i → C_i ∈ ⟦M⟧^Σ_ρ and B_i ∈ ⟦N⟧^Σ_ρ, hence there exist two bases over ℂ, Γ_i
and Γ′_i, such that Γ_i ⊨ ρ, Γ′_i ⊨ ρ, and moreover Γ_i ⊢^Σ M : B_i → C_i, Γ′_i ⊢^Σ N : B_i.
Consider the basis Γ* = ⊎_{i∈I}(Γ_i ⊎ Γ′_i). We have Γ* ⊨ ρ, Γ* ⊢^Σ M : B_i → C_i
and Γ* ⊢^Σ N : B_i. From the last two judgments, by applying (→E), we deduce
Γ* ⊢^Σ MN : C_i, which implies, by (∩I) and (≤), Γ* ⊢^Σ MN : A, hence A ∈ ⟦MN⟧^Σ_ρ.

(vi) Suppose that the premise holds and A ∈ ⟦λx.M⟧^Σ_ρ. The case ν ≤ A is trivial.
Otherwise there is Γ ⊨ ρ such that Γ ⊢^Σ λx.M : A. Since x ∉ FV(λx.M), by rule (S)
we can assume x ∉ Γ. By Lemma 1(iii) there exist I and B_i, C_i ∈ T such that, for each
i ∈ I, Γ, x : B_i ⊢^Σ M : C_i, and ⋂_{i∈I}(B_i → C_i) ≤ A. By the premise, we get, for each
i ∈ I, Γ, x : B_i ⊢^Σ N : C_i, hence by (→I) and (≤) we get Γ ⊢^Σ λx.N : A, which
implies ⟦λx.M⟧^Σ_ρ ⊆ ⟦λx.N⟧^Σ_ρ. Similarly one proves ⟦λx.N⟧^Σ_ρ ⊆ ⟦λx.M⟧^Σ_ρ.

Corollary 1 ((Restricted) filter models). A filter λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩ is a filter
model for the restricted λ-calculus λ_R iff for any redex (λx.M)N ∈ R, environment ρ,

⟦(λx.M)N⟧^Σ_ρ = ⟦M[x := N]⟧^Σ_ρ, that is,

(♮) ∃Γ ⊨ ρ. Γ ⊢^Σ (λx.M)N : A  ⟺  ∃Γ′ ⊨ ρ. Γ′ ⊢^Σ M[x := N] : A.



4 Four Characterization Results 

The first two characterization results we give concern natural type preorders. We begin 
studying the representability of interesting classes of (strict) Scott continuous functions. 
These characterizations generalise those given in [8]. 

Definition 13 (Representable functions). Given a type preorder Σ, a function f :
ℱ^Σ → ℱ^Σ is said to be representable in Σ if it is representable in the induced filter
λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩, that is there exists X ∈ ℱ^Σ such that for any Y ∈ ℱ^Σ,
X · Y = f(Y).

The next lemma is useful for characterizing the sets of representable functions.
Lemma 2. Let Σ = (ℂ, ≤) be a natural type preorder and f : ℱ^Σ → ℱ^Σ be a
continuous function. Then f is representable iff for all I and A_i, B_i, C, D ∈ T(ℂ), with
D ≁ Ω, it holds

(♭) (∀i ∈ I. B_i ∈ f(↑A_i)) & ⋂_{i∈I}(A_i → B_i) ≤ C → D  ⟹  D ∈ f(↑C).

Proof. (⇐) Let X_f = ↑{A → B | B ∈ f(↑A)}. We prove that for any C ∈ T(ℂ) we
have f(↑C) = X_f · ↑C.

X_f · ↑C = ↑{D | ∃C′ ≥ C. C′ → D ∈ X_f}       by definition of application
         = ↑{D | C → D ∈ X_f}                  by (η)
         = {D | C → D ∈ X_f}                   by (→-∩)
         = {D | ∃I, A_i, B_i. (∀i ∈ I. B_i ∈ f(↑A_i)) &
                ⋂_{i∈I}(A_i → B_i) ≤ C → D}    by definition of X_f
         = {D | D ∈ f(↑C)}                     by (♭)
         = f(↑C).

(⇒) Suppose by contradiction that there exist I, A_i, B_i, C, D (with D ≁ Ω), such that
⋂_{i∈I}(A_i → B_i) ≤ C → D, and B_i ∈ f(↑A_i) for any i ∈ I, but D ∉ f(↑C). If X is any
filter candidate to represent f, then, for any i ∈ I, B_i ∈ X · ↑A_i, which implies, by easy
computations, A_i → B_i ∈ X, for any i ∈ I. Since ⋂_{i∈I}(A_i → B_i) ≤ C → D, it follows
that C → D ∈ X, hence D ∈ X · ↑C, making it impossible that X represents f.



Theorem 2 (Characterization of sets of representable functions). 

(i) The set of functions representable in a strict natural preorder Σ = (ℂ, ≤) contains:

1) the step function ⊥ ⇒ ⊥;

2) the strict step functions iff A→B ≤ C→D imply C ≤ A, B ≤ D for all
A, B, C, D ∈ T(ℂ);

3) the strict continuous functions iff Σ is a beta preorder.

(ii) The set of functions representable in a natural preorder Σ = (ℂ, ≤) contains:

1) the step function ⊥ ⇒ ⊥ iff A→B ~ Ω implies B ~ Ω;

2) the constant functions iff Ω→B ≤ C→D implies B ≤ D for all B, C, D in
T(ℂ);

3) the step functions iff A→B ≤ C→D and D ≁ Ω imply C ≤ A, B ≤ D for all
A, B, C, D ∈ T(ℂ);

4) the continuous functions iff Σ is a beta preorder.

Proof (sketch). For each point above, the thesis follows by applying condition (♭) of Lemma
2 to the class of functions involved, taking into account that:

- D ∈ (↑A ⇒ ↑B)(↑C) iff C ≤ A and B ≤ D;

- B ∈ f(↑A) iff ↑A ⇒ ↑B ⊑ f;

where ↑A ⇒ ↑B is the step function from ↑A to ↑B, f is a continuous function and ⊑
is the point-wise ordering.
All the type theories of Figure 2 are beta. Moreover, the type preorders AO, BCD
(CDV, EHR) are (strict) natural and therefore, by Theorem 2(ii4), in all the filter
λ-structures induced by such preorders all (strict) continuous functions are representable.

Our second characterization result on natural type preorders consists in giving a
criterion for selecting those type preorders whose induced filter λ-structures are indeed
filter models of the whole λ-calculus. To do that we use a result of [20], in which an
applicative structure is shown to be a model provided that it contains the combinators
K, S and ε. Thus, a condition for having a filter model can be obtained by simply forcing
the existence of such combinators.

Theorem 3 (Characterization of model-inducing preorders). 

Let Σ = (ℂ, ≤) be a natural type preorder. The filter λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩ is a filter
model of the whole λ-calculus iff the following three conditions are fulfilled.

(i) (existence of K)

∀C, D, E ∃I, A_i, B_i. ⋂_{i∈I}(A_i → B_i → A_i) ≤ C → D → E
    ⟺  C ≤ E;

(ii) (existence of S)

∀D, E, F, G ∃I, A_i, B_i, C_i. ⋂_{i∈I}((A_i → B_i → C_i) → (A_i → B_i) → A_i → C_i)
    ≤ D → E → F → G
    ⟺  ∃H. E ≤ F → H and D ≤ F → H → G;

(iii) (existence of ε)

∀C, D ∃I, A_i, B_i. ⋂_{i∈I}((A_i → B_i) → A_i → B_i) ≤ C → D
    ⟺  ∃J, E_j, F_j. C ≤ ⋂_{j∈J}(E_j → F_j) ≤ D.

The filter λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩ is an extensional model if the third condition above
is replaced by:

(iii′) ∀A ∃I, A_i, B_i. A ~ ⋂_{i∈I}(A_i → B_i).

These conditions are obtained by considering the application of combinators to filters.
Similarly, one could characterize the representability of an arbitrary combinator of the
shape λx_1 ... x_n.C, where C is a combination of variables (that is, it does not contain any
λ-abstraction).

Our third result characterizes those type preorders inducing filter models for the
main restricted λ-calculi studied in the literature, namely the λI-calculus [11], the
λKN-calculus [17] and the call-by-value λ-calculus [21]. The redexes of these calculi are
defined as follows.

Definition 14 (Restricted redexes). ([21,11,17])

(i) A redex (λx.M)N is a β_v-redex if N is a variable or an abstraction.

(ii) A redex (λx.M)N is a β-I-redex if x ∈ FV(M).

(iii) A redex (λx.M)N is a β-KN-redex if it is a β-I-redex or N is either a variable
or a closed strongly normalising term.
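The first two clauses of Definition 14 are purely syntactic and can be checked directly on a term. The sketch below is illustrative only: terms are encoded as tuples ("var", x), ("lam", x, body) and ("app", fun, arg), and all function names are hypothetical. The third clause is deliberately left out, since it depends on strong normalisation, which is undecidable in general.

```python
def free_vars(t):
    """Set of free variables of a term encoded as nested tuples."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def is_redex(t):
    """True if t has the shape (lam x. M) N."""
    return t[0] == "app" and t[1][0] == "lam"


def is_beta_v_redex(t):
    """Clause (i): the argument is a variable or an abstraction."""
    return is_redex(t) and t[2][0] in ("var", "lam")


def is_beta_i_redex(t):
    """Clause (ii): the bound variable occurs free in the body."""
    return is_redex(t) and t[1][1] in free_vars(t[1][2])
```

For example, (λx.y)z satisfies clause (i) but not clause (ii), while (λx.x)(yz) satisfies clause (ii) but not clause (i).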

Before characterizing (restricted) filter models we need a technical result on typing 
properties of strongly normalizing terms: for a proof see [13]. 




Tailoring Filter Models 






Proposition 1 (Characterization of strongly normalizing terms). A λ-term M is
strongly normalizing iff for any type preorder Σ = (ℂ, ≤) there exist A ∈ T(ℂ) and a
basis Γ over ℂ such that Γ ⊢^Σ M : A.

Let Σ = (ℂ, ≤) be a type preorder: we say that a basis Γ* over ℂ is a (Σ, Γ, M)-basis
iff the subjects of Γ* are the variables which occur free in M and are not subjects
of Γ, i.e. {x ∈ Γ*} = {x ∈ FV(M) | x ∉ Γ}.

Theorem 4 (Characterizations of (restricted) filter models). Let Σ = (ℂ, ≤) be a
type preorder and ♯(Σ, M, x) 2 be short for:

∀Γ basis over ℂ ∀Γ* (Σ, Γ, M)-basis ∀A, B ∈ T(ℂ).
Γ ⊢^Σ λx.M : B→A  ⟹  Γ*, Γ, x:B ⊢^Σ M : A.

The filter λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩:

(i) is a model of the call-by-value λ-calculus iff for any M, x, ρ:

1) ⟦λx.M⟧^Σ_ρ ≠ ∅ and

2) ♯(Σ, M, x) holds;

(ii) is a model of the λI-calculus iff for any M, x, N, ρ such that x ∈ FV(M):

1) ⟦M[x := N]⟧^Σ_ρ ≠ ∅ implies ⟦N⟧^Σ_ρ ≠ ∅ and

2) ♯(Σ, M, x) holds;

(iii) is a model of the λKN-calculus iff it is a model of the λI-calculus;

(iv) is a model of the whole λ-calculus iff for any M, ρ:

1) ⟦M⟧^Σ_ρ ≠ ∅ and

2) ♯(Σ, M, x) holds.

Proof. We prove point (i) in detail and give hints for the other points.

(⇒) Let ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩ be a model of the call-by-value calculus. Assume by
contradiction that ⟦λx.M_0⟧^Σ_{ρ_0} = ∅ for some M_0, x, ρ_0: this implies ⟦(λzy.y)(λx.M_0)⟧^Σ_{ρ_0} =
⟦λzy.y⟧^Σ_{ρ_0} · ⟦λx.M_0⟧^Σ_{ρ_0} = ∅. Since (λzy.y)(λx.M_0) is a β_v-redex which reduces to
λy.y and A→A ∈ ⟦λy.y⟧^Σ_ρ for all ρ, this contradicts (⇐) of condition (♮) in Corollary
1. Now we prove that ♯(Σ, M, x) holds for any M, x. Let Γ ⊢^Σ λx.M : B → A. Then
Γ′ ⊢^Σ (λx.M)x : A, where Γ′ = Γ, x:B. Let ρ be an environment such that ρ(y) = ↑C
if y:C ∈ Γ′. It is easy to check that Γ′ ⊨ ρ. Since (λx.M)x is a β_v-redex, condition
(♮) of Corollary 1 holds, hence there exists a basis Γ″ such that Γ″ ⊨ ρ and moreover
Γ″ ⊢^Σ M : A. By definition of ⊨, we have that for any variable y, if y:D ∈ Γ″ and
y:C ∈ Γ′ then C ≤ D. Applying rules (S) and (≤L) we obtain Γ*, Γ, x:B ⊢^Σ M : A
where Γ* = {z : E ∈ Γ″ | z ∈ FV(M) & z ∉ Γ} is a (Σ, Γ, M)-basis. Notice that the
predicates in Γ* must vary according to the environment ρ, and that ρ by construction
can assign arbitrary filters to the subjects of Γ*. This completes the proof of (⇒).

(⇐) First we show (⇒) of condition (♮) in Corollary 1. From Γ ⊢^Σ (λx.M)N : A,
by Lemma 1(ii) and ♯(Σ, M, x) we get Γ*, Γ, x : B_i ⊢^Σ M : C_i, Γ ⊢^Σ N : B_i, and
⋂_{i∈I} C_i ≤ A for all (Σ, Γ, M)-bases Γ* and for some I, B_i, C_i ∈ T. Then Γ*, Γ ⊢^Σ
M[x := N] : C_i follows by rules (C) and (W), and so Γ*, Γ ⊢^Σ M[x := N] : A using
rules (∩I) and (≤). We conclude by observing that we can choose Γ* such that Γ* ⊨ ρ.

2 Notice that (Σ, Γ, M)-bases in ♯(Σ, M, x) are useful only for λ∩_Ω^Σ, since λ∩_B^Σ and λ∩_ν^Σ
enjoy the sub-formula property.
As to (⇐) of (♮), let 𝒟 be a deduction of Γ ⊢^Σ M[x := N] : A and let Γ_i ⊢^Σ N : B_i for
i ∈ I be all the statements in 𝒟 whose subject is N. Without loss of generality we can
assume that x does not occur in Γ.

If I is non-empty, notice that Γ ⊆ Γ_i but Γ↾FV(N) = Γ_i↾FV(N) (by Γ↾X we
denote {x : A ∈ Γ | x ∈ X}). So using rules (S) and (∩I), we have that Γ ⊢^Σ N : ⋂_{i∈I} B_i.
Moreover, one can easily see, by induction on M, that Γ, x : ⋂_{i∈I} B_i ⊢^Σ M : A. Thus,
by rule (→I), we have Γ ⊢^Σ λx.M : ⋂_{i∈I} B_i → A. Hence, by (→E) we can conclude
Γ ⊢^Σ (λx.M)N : A.

If I is empty, we get from 𝒟 a derivation of Γ ⊢^Σ M : A by replacing each N by x. Two
cases have to be considered:

- if N is a variable, say x, then by Definition 12(ii) ρ(x) ≠ ∅;

- if N is an abstraction, then ⟦N⟧^Σ_ρ ≠ ∅ by hypothesis.

In both cases there is a basis Γ′ ⊨ ρ, such that Γ′ ⊢^Σ N : B for some type B. By rule
(W) we get Γ, x : B ⊢^Σ M : A and we can conclude Γ ⊎ {x : B} ⊎ Γ′ ⊢^Σ (λx.M)N : A.

As to the proofs of the other points, proceed as in the previous case taking into
account for point (iii) that if N is a closed strongly normalizing term, by Proposition 1
it is typable in all intersection type systems from the empty basis.



Notice that Ω ∈ ℂ or ν ∈ ℂ implies condition (i1) of Theorem 4, Ω ∈ ℂ or ν ∉ ℂ
implies condition (ii1) of Theorem 4 and Ω ∈ ℂ implies condition (iv1) of Theorem 4.

The characterization of filter models can be extended to encompass extensionality. To
this aim it is useful to know when typing is invariant under η-expansion and η-reduction.
Let



(η-exp)   M →_η N and Γ ⊢^Σ N : A  ⟹  Γ ⊢^Σ M : A

(η-red)   M →_η N and Γ ⊢^Σ M : A  ⟹  Γ ⊢^Σ N : A



The next proposition corresponds to Theorem 4.5 of [3].



Proposition 2 (Characterization of subject η-reduction/expansion).

(i) Rule (η-exp) is admissible in λ∩^Σ iff Σ is eta;

(ii) Rule (η-red) is admissible in λ∩_B^Σ iff Σ validates CDV, in λ∩_Ω^Σ iff Σ validates
CDV ∪ {(Ω-η)}, and it is never admissible in λ∩_ν^Σ.

Theorem 5 (Characterization of extensional (restricted) filter models). 

Let Σ = (ℂ, ≤) be a type preorder. The filter λ-structure ⟨ℱ^Σ, ·, ⟦ ⟧^Σ⟩ is an extensional
filter model of the restricted λ-calculus λ_R iff it is a model of λ_R, Σ is an eta type
preorder which validates CDV, and moreover if Ω ∈ ℂ then Σ validates axiom (Ω-η), if
ν ∈ ℂ then ν ∈ ⟦M⟧^Σ_ρ for all M, ρ.

Proof. (⇒) Let ψ ∈ ℂ be a constant that does not satisfy any of the conditions in Definition
4(ii). One can show that ⟦x⟧^Σ_{ρ[x := ↑ψ]} ≠ ⟦λy.xy⟧^Σ_{ρ[x := ↑ψ]}: this implies that Σ must be eta.

We have A → B ∩ C ∈ ⟦λy.xy⟧^Σ_{ρ[x := ↑(A→B)∩(A→C)]} for all A, B, C, but A → B ∩ C ∈
↑(A → B) ∩ (A → C) only if Σ validates axiom (→-∩). Similarly one can show that Σ
must validate axiom (η) and axiom (Ω-η) when Ω ∈ ℂ. Lastly, if ν ∈ ℂ then ν ∈ ⟦λx.Mx⟧^Σ_ρ
for all M, ρ by axiom (Ax-ν), which implies ν ∈ ⟦M⟧^Σ_ρ for all M, ρ.

(⇐) follows from Proposition 2, except for the case ν ∈ ℂ, in which ν is harmless, being
contained in the interpretations of all terms.
Using the previous theorems we get: (/ rV , •, [ ] v ) with Ve{CX>V} is a model of the
λI-calculus, with

nabla = ETTIZ is a model of the call-by-value λ-calculus, with Vg{AO, BCD} is a
model of the whole λ-calculus.



5 Trimmed Filter Models 

In this section we provide actual examples of filter models of the whole λ-calculus where
not all continuous functions are representable. In particular we devise two type preorders
which are natural but not beta. This implies, by Theorem 2(ii4), that some continuous
function cannot be represented in the filter λ-structures induced by such preorders. For
the first model we acknowledge the adaptation of an idea in [8].

Definition 15 (Trimmed models). A (strict) filter model will be called trimmed if not 
all the (strict) continuous functions are representable in it. 

The Type Preorder 

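Definition 16 ($\Sigma^{\Diamond\heartsuit}$). Let $\mathbb{C}^{\Diamond\heartsuit} = \{\Omega, \Diamond, \heartsuit\}$. The type preorder $\Sigma^{\Diamond\heartsuit}$ is the preorder
induced by the set of rules $\Diamond\heartsuit = \mathrm{BCD} \cup \{(\Diamond\text{-}\heartsuit)\}$, where

$(\Diamond\text{-}\heartsuit)\quad A \le A[\Diamond := \heartsuit]$.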

In order to show that induces a trimmed filter model we need a few technical 
results. 

Lemma 3. (i) A <$ B implies A[<0> := A?] <<> f?[<0> := ( v ) ]; 

(ii) r M : A implies := A?] M : A[0 := 9?]; 

(iii) r, r' \-<> M : A implies L. r'[<> := V] M : A[C> := <?]; 

(iv) Vie/. r,x : Ai M : B h and f] ieI (Ai^Bi) <$ f| je A G J^ D j) imply 
VjeJ. r, x : Cj M : D r 

Proof, (i) By induction on the definition of 

(ii) By induction on derivations using (i) for rule (<<>). 

(iii) From (ii) and the admissible rule (<<>L), taking into account that if (x : IL)GJ\ 
then (x : B [( > := ^])eU[<0 := A?] and B <<> B[f) := <?]. 

(iv) We shall denote by (possibly with indexes) elements of (LA We show by 
induction on the definition of <<> that 

(D ^ (D h^H^h) — <> (C\jEj(Cj (T (Dfceif £k) 

and Vie/. T. x : Ai M : Bi imply V)G J. L. x : Cj M : Dj. 

The only interesting case is when the applied rule is (<0>-A?), i.e. we have 

f^(Ai^Bi) n ( P| ifh ) <<> FI ( i ph))^ '■= ^]- 

ie/ h£H i£l h£H 

By hypothesis B, x : At M : B u so we are done by (iii). 
Theorem 6. The natural type preorder is not beta, but , ■, [ ]^) is a filter model
of the whole λ-calculus.

Proof. A counter-example to the condition of Definition 4(i) is <0>— >-<0> <o E— since 

^o<>- 

To show that (tF^, •, [ ]^) is a model of the whole λ-calculus, it suffices, by Theorem
4(iv), to verify that M, x) holds for any M, x. By Lemma l(iii) T A x.M : 

A—>B implies r, x : Ci M : D, for some J, C'i, D t such that C\ i€ j(Ci— >Df) 
A—>B. So, we are done by Lemma 3(iv). 

The step function f <C> =>t 0 is an instance of non-representable function in . 

The Type Preorder £* 

Definition 17 (The mapping p). Let (E* = {17, 4. ♦} The mapping p : T*— is 
inductively defined as follows: 

p(E) = 17; p(*) = 17; p(*) = X 

p(A→B) = A→p(B);

p(A∩B) = p(A)∩p(B).



Definition 18 (£*). E* is the type preorder induced by the set of rules 4 k = BCD U 
{(♦-4), (♦-♦—»)} where: 

(4-*) A < p (A); 

— >) A^-B < p(A)— >p(.B). 

Given a basis T, let pf /'j be the basis obtained by substituting any judgment x:A of /’ 
by x:p(A). 

We show that E* induces a trimmed filter model. As in the case of E' ? , we prove 
some technical results in order to show that M, x)) holds for any M, x. 

Lemma 4. (i) The mapping p is idempotent, i.e. p(p(A)) = p(A). 

(ii) A-ep(B) p(A)^p(B); 

(iii) A B implies p(A) P (B); 

(iv) r M : A implies p(L) M : p(A); 

(v) T, r' h* M : A implies T, p(r') h* M : p(A); 

(vi) Vie/, r, x : Ai L* M : B t and fj i£ AAi-tBf) <* fl \jeA G J^ D j) imply 
VjeJ. r, x : Cj L* M : D r 



Proof, (i) Easy. 

(ii) We get p(A) — > p(B) A — > *p(B) by axiom and rule (?y). Moreover 

A -A p (B) <* p(A) -A p(p(S)) = p(A) -A p(B) by axiom (4|t-& — ►) and (i). 

(iii) By induction on the definition of <+ using (i) and (ii). 

(iv) By induction on derivations using (iii) for rule (<*). We give the details just for 
rule (— > E). Suppose M = NL and /h* IV: B^rA, T h* L : B. Then, by induction, 
p(r) b* N : p(B^A), p(r) b* L : p (B). Since p (B^A) = B->p(A), by (ii) we
get p(r) b* NL : p(A). 

(v) From (iv) and the admissible rule (<+ L), taking into account that if (x : IT)Gf,
then (a; : p(/?))ep(.r) and B <+ p (B). 

(vi) We shall denote by ip, £ (possibly with indexes) elements of (E*.We show by 
induction on the definition of that 

(D ^*)) ^ (H heH^h) (fl jejiCj—t'Dj)) (~l (flfceif £fc) 

and Vis/. -T, x : A; b* M : Bi imply VjS J. F, x : Cj b* M : Dj. 

The only interesting case is when the applied rules are (4k-4k) or — >■), i.e. we have 

A B < 4 p(A B) = A p (B) or A-^B < ^ p(A)— >-p(S). By hypothesis 
r , a ::A b^ M : B, so we are done by (v). 

Theorem 7. The natural type preorder L * is not beta, but (7+, •,[]*} is a filter model 
of the whole λ-calculus.

Proof. As in the proof of Theorem 6, Lemma 4(vi) allows us to prove that for any M,
x, 3(17*, M, x) holds, hence we can apply Theorem 4(iv) in order to conclude that 
■,[]*> is a model. On the other hand 4|fe is not a beta theory. For instance, ip^KU 
fl— > lo, but fl-f^tp. 

The step function f ♦ =>t A is an example of function not representable in 7F* . 
Actually 7F* is the inverse limit solution of the domain equation $\mathcal{D} \simeq [\mathcal{D} \to \mathcal{D}]$
computed in the category of p-lattices (see [2]), whose objects are ω-algebraic lattices
$\mathcal{D}$ endowed with a finitary additive projection $\delta : \mathcal{D} \to \mathcal{D}$ and whose morphisms $f :
(\mathcal{D}, \delta) \to (\mathcal{D}', \delta')$ are continuous functions such that $\delta' \circ f \sqsubseteq f \circ \delta$.

6 Conclusions 

When stepping into the world of λ-calculus semantics, intersection type systems turn out
to be a useful “vehicle” to move around, since they provide a finitary way to describe and
analyse particular classes of models. By simply adding a single constant or condition
on a type preorder, a different semantical domain is characterized. One is then naturally
led to expect that intersection types will provide, in the long run, a sort of tailor
shop in which particular domains can be tailored for any specific need. As a matter of
fact, the feasibility of proceeding in this direction has also been shown in [3].
In the present paper we have made a step forward in this direction. Filter λ-structures
induced by intersection type preorders have been shown to provide models for the whole
λ-calculus and for a number of relevant “restricted” λ-calculi when particular conditions
on the type preorders are fulfilled. Moreover, our proposed conditions provide precise
characterizations for intersection type-induced models.

When a model is produced, the second step is almost always to make it precisely fit
the calculus, by “trimming” it and eliminating the superfluous parts. We have shown in the
present paper that in the framework of intersection-induced models for the λ-calculus
such a trimming is indeed possible, by providing two examples of filter models in which
not all the continuous functions are representable.

Much remains to be done on model “tailoring”, such as investigating whether the many conditions
on type preorders that are implicitly expressed in terms of generation properties of type
assignment can be made explicit on the type preorders themselves. It would also be
interesting to check whether the webbed models [7] allow for “tailoring operations”.

Acknowledgments. The authors are grateful to Furio Honsell, Henk Barendregt and 
Wil Dekkers for enlightening discussions on the subject of the present paper. We wish 
also to thank the anonymous referees who pointed out interesting directions for further
research and whose comments and suggestions have been very helpful to improve the 
presentation of the paper. 
Abstract. Locales provide a module system for the Isabelle proof assistant.
Recently, locales have been ported to the new Isar format for
structured proofs. At the same time, they have been extended by locale 
expressions, a language for composing locale specifications, and by struc- 
tures, which provide syntax for algebraic structures. The present paper 
presents both and is suitable as a tutorial to locales in Isar, because it 
covers both basics and recent extensions, and contains many examples. 



1 Overview 

Locales are an extension of the Isabelle proof assistant. They provide support 
for modular reasoning. Locales were initially developed by Kammüller [4] to support
reasoning in abstract algebra, but are also applied in other domains, for
example bytecode verification [5]. Kammüller's original design, implemented in
Isabelle99, provides, in addition to means for declaring locales, a set of ML func- 
tions that were used along with ML tactics in a proof. In the meantime, the input 
format for proof in Isabelle has changed and users write proof scripts in ML only 
rarely if at all. Two new proof styles are available, and can be used interchange- 
ably: linear proof scripts that closely resemble ML tactics, and the structured 
Isar proof language by Wenzel [8]. Subsequently, Wenzel re-implemented locales
for the new proof format. The implementation, available with Isabelle2003, constitutes
a complete re-design and exploits the fact that both Isar and locales are based
on the notion of context, and thus locales are seen as a natural extension of
Isar. Nevertheless, locales can also be used with proof scripts: their use does not 
require a deep understanding of the structured Isar proof style. 

At the same time, Wenzel considerably extended locales. The most important
addition is locale expressions, which allow locales to be combined more freely. Previously
only linear inheritance was possible. Now locales support multiple inheritance
through a normalisation algorithm. Also new are structures, which provide
special syntax for locale parameters that represent algebraic structures.
Unfortunately, Wenzel provided only an implementation but hardly any documentation.
Besides providing documentation, the present paper is a high-level
description of locales, and in particular locale expressions. It is meant as a first
step towards the semantics of locales, and also as a base for comparing locales 
with module concepts in other provers. It also constitutes the base for future 
extensions of locales in Isabelle. The description was derived mainly by experi- 
menting with locales and partially also by inspecting the code. 

The main contribution of the author of the present paper is the abstract de- 
scription of Wenzel’s version of locales, and in particular of the normalisation 
algorithm for locale expressions (see Section 4.2). Contributions to the imple- 
mentation are confined to bug fixes and to provisions that enable the use of 
locales with linear proof scripts. 

Concepts are introduced along with examples, so that the text can be used 
as tutorial. It is assumed that the reader is somewhat familiar with Isabelle 
proof scripts. Examples have been phrased as structured Isar proofs. However, 
in order to understand the key concepts, including locale expressions and their
normalisation, detailed knowledge of Isabelle is not necessary. 



2 Locales: Beyond Proof Contexts 

In tactic-based provers the application of a sequence of proof tactics leads to a 
proof state. This state is usually hard to predict from looking at the tactic script, 
unless one replays the proof step-by-step. The structured proof language Isar is 
different. It is additionally based on proof contexts, which are directly visible in 
Isar scripts, and since tactic sequences tend to be short, this commonly leads to 
clearer proof scripts. 

Goals are stated with the theorem command. This is followed by a proof. When 
discharging a goal requires an elaborate argument (rather than the application 
of a single tactic) a new context may be entered (proof). Inside the context, 
variables may be fixed (fix), assumptions made (assume) and intermediate goals
stated (have) and proved. The assumptions must be dischargeable by premises
of the surrounding goal, and once this goal has been proved (show) the proof
context can be closed (qed). Contexts inherit from surrounding contexts, but it
is not possible to export from them (with the exception of the proved goal); they
“disappear” after the closing qed. Facts may have attributes — for example, 
identifying them as default to the simplifier or classical reasoner. 

Locales extend proof contexts in various ways: 

— Locales are usually named. This makes them persistent. 

— Fixed variables may have syntax. 

— It is possible to add and export facts. 

— Locales can be combined and modified with locale expressions. 

The Locales facility extends the Isar language: it provides new ways of stating 
and managing facts, but it does not modify the language for proofs. Its purpose 
is to support writing modular proofs. 
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3 Simple Locales 

3.1 Syntax and Terminology 

The grammar of Isar is extended by commands for locales as shown in Figure 1. 
A key concept, introduced by Wenzel, is that locales are (internally) lists of con- 
text elements. There are four kinds, identified by the keywords fixes, assumes, 
defines and notes. 



attr-name    ::= name | attribute | name attribute
locale-expr  ::= locale-expr1 ( “+” locale-expr1 )*
locale-expr1 ::= ( qualified-name | “(” locale-expr “)” ) ( name | “_” )*
fixes        ::= name [ type ] [ “(” structure “)” | mixfix ]
assumes      ::= [ attr-name ] proposition
defines      ::= [ attr-name ] proposition
notes        ::= [ attr-name “=” ] ( qualified-name [ attribute ] )+
element      ::= fixes fixes ( and fixes )*
               | assumes assumes ( and assumes )*
               | defines defines ( and defines )*
               | notes notes ( and notes )*
               | includes locale-expr
locale       ::= element+
               | locale-expr [ “+” element+ ]
in-target    ::= “(” in qualified-name “)”
theorem      ::= ( theorem | lemma | corollary ) [ in-target ] [ attr-name ]
theory-level ::= ...
               | locale name [ “=” locale ]
               | ( theorems | lemmas ) [ in-target ] [ attr-name ] ( qualified-name [ attribute ] )+
               | declare [ in-target ] ( qualified-name [ attribute ] )+
               | theorem proposition proof
               | theorem element* shows proposition proof
               | print_locale locale
               | print_locales

Fig. 1. Locales extend the grammar of Isar. 

At the theory level (that is, at the outer syntactic level of an Isabelle input file),
locale declares a named locale. Other kinds of locales, locale expressions and
unnamed locales, will be introduced later. When declaring a named locale, it is 
possible to import another named locale, or indeed several ones by importing 
a locale expression. The second part of the declaration, also optional, consists 
of a number of context element declarations. Here, a fifth kind, includes, is 
available. 

A number of Isar commands have an additional, optional target argument, which 
always refers to a named locale. These commands are theorem (together with 
lemma and corollary), theorems (and lemmas), and declare. The effect of
specifying a target is that these commands focus on the specified locale, not the 
surrounding theory. Commands that are used to prove new theorems will add 
them not to the theory, but to the locale. Similarly, declare modifies attributes 
of theorems that belong to the specified target. Additionally, for theorem (and 
related commands), theorems stored in the target can be used in the associated 
proof scripts. 

The Locales package permits a long goals format for propositions stated with 
theorem (and friends). While normally a goal is just a formula, a long goal 
is a list of context elements, followed by the keyword shows, followed by the 
formula. Roughly speaking, the context elements are (additional) premises. For 
an example, see Section 4.4. The list of context elements in a long goal is also
called an unnamed locale.

Finally, there are two commands to inspect locales when working in interactive 
mode: print_locales prints the names of all targets visible in the current theory,
print_locale outputs the elements of a named locale or locale expression.

The following presentation uses the notation of Isabelle's meta-logic, so a brief
explanation is in order. The logical primitives are universal quantification (⋀),
entailment (⟹) and equality (≡). Variables (not bound variables) are sometimes
preceded by a question mark. The logic is typed. Type variables are denoted by
'a, 'b etc., and ⇒ is the function type. Double brackets ⟦ and ⟧ are used to
abbreviate nested entailment.



3.2 Parameters, Assumptions, and Facts 

From a logical point of view a context is a formula schema of the form

$\bigwedge x_1 \ldots x_n.\ [\![ C_1; \ldots; C_m ]\!] \Longrightarrow \ldots$

The variables $x_1, \ldots, x_n$ are called parameters, the premises $C_1, \ldots, C_m$ assumptions.
A formula $F$ holds in this context if

(1) $\bigwedge x_1 \ldots x_n.\ [\![ C_1; \ldots; C_m ]\!] \Longrightarrow F$

is valid. The formula is called a fact of the context.

A locale allows fixing the parameters $x_1, \ldots, x_n$ and making the assumptions
$C_1, \ldots, C_m$. This implicitly builds the context in which the formula $F$ can be
established. Parameters of a locale correspond to the context element fixes, and
assumptions may be declared with assumes. Using these context elements one
can define the specification of semigroups.

locale semi =
  fixes prod :: "['a, 'a] ⇒ 'a" (infixl "•" 70)
  assumes assoc: "(x • y) • z = x • (y • z)"



The parameter prod has a syntax annotation allowing the infix "•" in the assumption
of associativity. Parameters may have arbitrary mixfix syntax, like
constants. In the example, the type of prod is specified explicitly. This is not
necessary. If no type is specified, a most general type is inferred simultaneously
for all parameters, taking into account all assumptions (and type specifications
of parameters, if present). 1

Free variables in assumptions are implicitly universally quantified, unless they 
are parameters. Hence the context defined by the locale semi is 

⋀prod. ⟦ ⋀x y z. prod (prod x y) z = prod x (prod y z) ⟧ ⟹ ...

The locale can be extended to commutative semigroups. 

locale comm_semi = semi + 

assumes comm: "x • y = y • x" 

This locale imports all elements of semi. The latter locale is called the import of 
comm_semi. The definition adds commutativity, hence its context is 

⋀prod. ⟦ ⋀x y z. prod (prod x y) z = prod x (prod y z);
         ⋀x y. prod x y = prod y x ⟧ ⟹ ...

One may now derive facts — for example, left-commutativity — in the context 
of comm_semi by specifying this locale as target, and by referring to the names of 
the assumptions assoc and comm in the proof. 

theorem (in comm_semi) lcomm:
  "x • (y • z) = y • (x • z)"
proof -
  have "x • (y • z) = (x • y) • z" by (simp add: assoc)
  also have "... = (y • x) • z" by (simp add: comm)
  also have "... = y • (x • z)" by (simp add: assoc)
  finally show ?thesis .
qed

In this equational Isar proof, “...” refers to the right hand side of the preceding
equation. After the proof is finished, the fact lcomm is added to the locale
comm_semi. This is done by adding a notes element to the internal representation
of the locale, as explained in the next section.
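
As an informal sanity check outside Isabelle, the fact lcomm can be tested by brute force on a small finite model of comm_semi. The following Python sketch is only an illustration: the function names and the choice of addition modulo 5 are assumptions made here, and an exhaustive test on one model is no substitute for the proof above.

```python
from itertools import product

def is_semi(carrier, prod):
    """Assumption `assoc` of locale semi, checked on every triple."""
    return all(prod(prod(x, y), z) == prod(x, prod(y, z))
               for x, y, z in product(carrier, repeat=3))

def is_comm(carrier, prod):
    """Assumption `comm` added by locale comm_semi."""
    return all(prod(x, y) == prod(y, x)
               for x, y in product(carrier, repeat=2))

def lcomm_holds(carrier, prod):
    """The derived fact lcomm: x . (y . z) = y . (x . z)."""
    return all(prod(x, prod(y, z)) == prod(y, prod(x, z))
               for x, y, z in product(carrier, repeat=3))

# Hypothetical model: addition modulo 5 is a commutative semigroup.
carrier = range(5)
add_mod5 = lambda x, y: (x + y) % 5

# Hypothetical non-model: left projection is associative but not commutative.
left_proj = lambda x, y: x

print(is_semi(carrier, add_mod5), is_comm(carrier, add_mod5),
      lcomm_holds(carrier, add_mod5))
```

The left projection shows why both assumptions are needed: it satisfies assoc but neither comm nor lcomm.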



3.3 Locale Predicates and the Internal Representation of Locales 

In mathematical texts, often arbitrary but fixed objects with certain properties 
are considered — for instance, an arbitrary but fixed group G — with the purpose 
of establishing facts valid for any group. These facts are subsequently used on 
other objects that also have these properties. 

Locales permit the same style of reasoning. Exporting a fact F generalises the 
fixed parameters and leads to a (valid) formula of the form of equation (1). 

1 Type inference also takes into account definitions and import, as introduced later. 
If a locale has many assumptions (possibly accumulated through a number of
imports) this formula can become large and unwieldy. Therefore, Wenzel introduced
predicates that abbreviate the assumptions of locales. These predicates
are not confined to the locale but are visible in the surrounding theory.

The definition of the locale semi generates the locale predicate semi over the type
of the parameter prod, hence the predicate's type is (['a, 'a] ⇒ 'a) ⇒ bool.
Its definition is

semi_def:
  semi ?prod ≡ ∀x y z. ?prod (?prod x y) z = ?prod x (?prod y z).

In the case where the locale has no import, the generated predicate abbreviates 
all assumptions and is over the parameters that occur in these assumptions. 
The situation is more complicated when a locale extends another locale, as is the 
case for comm_semi. Two predicates are defined. The predicate comm_semi_axioms 
corresponds to the new assumptions and is called delta predicate, the locale 
predicate comm_semi captures the content of the entire locale, including the import.
If a locale has neither assumptions nor import, no predicate is defined. If a locale 
has import but no assumptions, only the locale predicate is defined. 



The Locales package generates a number of theorems for locale and delta predi- 
cates. All predicates have a definition and an introduction rule. Locale predicates 
that are defined in terms of other predicates (which is the case if and only if the 
locale has import) also have a number of elimination rules (called axioms). All 
generated theorems for the predicates of the locales semi and comm_semi are 
shown in Figures 2 and 3, respectively. 



Theorems generated for the predicate semi.

semi_def: semi ?prod ≡ ∀x y z. ?prod (?prod x y) z = ?prod x (?prod y z)
semi.intro:
  (⋀x y z. ?prod (?prod x y) z = ?prod x (?prod y z)) ⟹ semi ?prod



Fig. 2. Theorems for the locale predicate semi. 



Note that the theorems generated by a locale definition may be inspected imme- 
diately after the definition in the Proof General interface [1] of Isabelle through 
the menu item “Isabelle/Isar>Show me . . . >Theorems”. 

Locale and delta predicates are used also in the internal representation of locales
as lists of context elements. While all fixes in a declaration generate internal
fixes, all assumptions of one locale declaration contribute to one internal
assumes element.

Theorems generated for the predicate comm_semi_axioms.
comm_semi_axioms_def:
  comm_semi_axioms ?prod ≡ ∀x y. ?prod x y = ?prod y x
comm_semi_axioms.intro:
  (⋀x y. ?prod x y = ?prod y x) ⟹ comm_semi_axioms ?prod

Theorems generated for the predicate comm_semi.
comm_semi_def: comm_semi ?prod ≡ semi ?prod ∧ comm_semi_axioms ?prod
comm_semi.intro: ⟦semi ?prod; comm_semi_axioms ?prod⟧ ⟹ comm_semi ?prod
comm_semi.axioms:
  comm_semi ?prod ⟹ semi ?prod
  comm_semi ?prod ⟹ comm_semi_axioms ?prod

Fig. 3. Theorems for the predicates comm_semi_axioms and comm_semi.

The internal representation of semi is

fixes prod :: "['a, 'a] ⇒ 'a" (infixl "•" 70)
assumes "semi prod"
notes assoc: "?x • ?y • ?z = ?x • (?y • ?z)"

and the internal representation of comm_semi is

fixes prod :: "['a, 'a] ⇒ 'a" (infixl "•" 70)
assumes "semi prod"
notes assoc: "?x • ?y • ?z = ?x • (?y • ?z)"
assumes "comm_semi_axioms prod"                        (2)
notes comm: "?x • ?y = ?y • ?x"
notes lcomm: "?x • (?y • ?z) = ?y • (?x • ?z)"

The notes elements store facts of the locales. The facts assoc and comm were added
during the declaration of the locales. They stem from assumptions, which are 
trivially facts. The fact lcomm was added later, after finishing the proof in the 
respective theorem command above. 

By using notes in a declaration, facts can be added to a locale directly. Of course, 
these must be theorems. Typical use of this feature includes adding theorems 
that are not usually used as default rewrite rules by the simplifier to the simpset
of the locale by a notes element with the attribute [simp]. This way it is also
possible to add specialised versions of theorems to a locale by instantiating locale 
parameters for unknowns or locale assumptions for premises. 
3.4 Definitions

Definitions were available in Kammüller's version of Locales, and they are in
Wenzel's. The context element defines adds a definition of the form $p\ x_1 \ldots x_n \equiv t$
as an assumption, where $p$ is a parameter of the locale (possibly an imported
parameter), and $t$ a term that may contain the $x_i$. The parameter may neither
occur in a previous assumes or defines element, nor on the right hand side
of the definition. Hence recursion is not allowed. The parameter may, however,
occur in subsequent assumes and on the right hand side of subsequent defines.
We call $p$ a defined parameter.

locale semi2 = semi +
  fixes rprod (infixl "⊙" 70)
  defines rprod_def: "rprod x y ≡ y • x"

This locale extends semi by a second binary operation "⊙" that is like "•" but
with reversed arguments. The definition of the locale generates the predicate
semi2, which is equivalent to semi, but no semi2_axioms. The difference between
assumes and defines lies in the way parameters are treated on export.



3.5 Export 

A fact is exported out of a locale by generalising over the parameters and adding 
assumptions as premises. For brevity of the exported theorems, locale predicates 
are used. Exported facts are referenced by writing qualified names consisting of 
locale name and fact name — for example, 

semi.assoc:
  semi ?prod ⟹ ?prod (?prod ?x ?y) ?z = ?prod ?x (?prod ?y ?z).

Defined parameters receive special treatment. Instead of adding a premise for 
the definition, the definition is unfolded in the exported theorem. In order to 
illustrate this we prove that the reverse operation "⊙" defined in the locale
semi2 is also associative.

theorem (in semi2) r_assoc: "(x ⊙ y) ⊙ z = x ⊙ (y ⊙ z)"
  by (simp only: rprod_def assoc)

The exported fact is 

semi2.r_assoc:
  semi2 ?prod ⟹ ?prod ?z (?prod ?y ?x) = ?prod (?prod ?z ?y) ?x.

The defined parameter is not present but is replaced by its definition. Note that 
the definition itself is not exported, hence there is no semi2.rprod_def. 2 

2 The definition could alternatively be exported using a let-construct if there was one 
in Isabelle’s meta-logic. Let is usually defined in object-logics. 
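
The effect of unfolding a defined parameter on export can be illustrated outside Isabelle. The Python sketch below is a hypothetical model, not part of the Locales package: it states r_assoc once in terms of the defined operation rprod and once in the exported form where only prod remains, and checks that both agree on a small example (string concatenation, chosen here because it is associative but not commutative).

```python
from itertools import product

def reversed_prod(prod):
    """Model of `defines rprod_def`: rprod x y is prod y x."""
    return lambda x, y: prod(y, x)

def r_assoc_inside(carrier, prod):
    """r_assoc as stated inside semi2, using the defined parameter rprod."""
    rprod = reversed_prod(prod)
    return all(rprod(rprod(x, y), z) == rprod(x, rprod(y, z))
               for x, y, z in product(carrier, repeat=3))

def r_assoc_exported(carrier, prod):
    """Exported form: rprod is unfolded, so only prod occurs."""
    return all(prod(z, prod(y, x)) == prod(prod(z, y), x)
               for x, y, z in product(carrier, repeat=3))

# Hypothetical test data: a few strings under concatenation.
words = ["", "a", "b", "ab"]
concat = lambda x, y: x + y

print(r_assoc_inside(words, concat), r_assoc_exported(words, concat))
```

Both functions compute the same condition, which mirrors the observation that the exported theorem mentions ?prod only.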







C. Ballarin 



4 Locale Expressions 

Locale expressions provide a simple language for combining locales. They are 
an effective means of building complex specifications from simple ones. Locale 
expressions are the main innovation of the version of Locales discussed here. 
Locale expressions are also the reason for introducing locale predicates.



4.1 Rename and Merge 

The grammar of locale expressions is part of the grammar in Figure 1. Locale 
names are locale expressions, and further expressions are obtained by rename 
and merge. 

Rename. The locale expression $e\ q_1 \ldots q_n$ denotes the locale of $e$ where parameters,
in the order in which they are fixed, are renamed to $q_1$ to $q_n$. The
expression is only well-formed if $n$ does not exceed the number of parameters
of $e$. Underscores denote parameters that are not renamed. Parameters whose
names are effectively changed lose mixfix syntax, and there is currently no
way to re-equip them with such.

Merge. The locale expression $e_1 + e_2$ denotes the locale obtained by merging
the locales of $e_1$ and $e_2$. This locale contains the context elements of $e_1$,
followed by the context elements of $e_2$.

In actual fact, the semantics of the merge operation is more complicated. If
$e_1$ and $e_2$ are expressions containing the same name, followed by identical
parameter lists, then the merge of both will contain the elements of those
locales only once. Details are explained in Section 4.2 below.

The merge operation is associative but not commutative. The latter is because
parameters of $e_1$ appear before parameters of $e_2$ in the composite
expression.

Rename can be used if a different parameter name seems more appropriate:
for example, when moving from groups to rings, a parameter G representing the
group could be changed to R. Besides this stylistic use, renaming is important
in combination with merge. Both operations are used in the following specification
of semigroup homomorphisms.

locale semi_hom = comm_semi sum + comm_semi + 
fixes hom 

assumes hom: "hom (sum x y) = hom x • hom y" 

This locale defines a context with three parameters sum, prod and hom. Only the
second parameter has mixfix syntax. The first two are associative operations,
the first of type ['a, 'a] ⇒ 'a, the second of type ['b, 'b] ⇒ 'b.

How are facts that are imported via a locale expression identified? Facts are 
always introduced in a named locale (either in the locale’s declaration, or by 
using the locale as target in theorem), and their names are qualified by the 
parameter names of this locale. Hence the full name of associativity in semi is
prod.assoc. Renaming parameters of a target also renames the qualifier of facts.
Hence, associativity of sum is sum.assoc. Several parameters are separated by
underscores in qualifiers. For example, the full name of the fact hom in the locale
semi_hom is sum_prod_hom.hom.
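
The naming scheme just described is simple enough to state as a one-line function. The following Python sketch is an illustration only; the function name qualified_fact_name is hypothetical and does not correspond to anything in the Isabelle sources.

```python
def qualified_fact_name(params, fact):
    """Qualify a fact name by the parameter names of the locale it was
    introduced in: parameters are joined by underscores, then a dot."""
    return "_".join(params) + "." + fact

print(qualified_fact_name(["prod"], "assoc"))
print(qualified_fact_name(["sum", "prod", "hom"], "hom"))
```

Renaming the parameter prod to sum therefore turns prod.assoc into sum.assoc without any change to the fact name itself.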

The following example is quite artificial, but it illustrates the use of facts.

theorem (in semi_hom) "hom x • (hom y • hom z) = hom (sum x (sum y z))" 
proof - 

have "hom x • (hom y • hom z) = hom y • (hom x • hom z)"
by (simp add: prod.lcomm) 

also have "... = hom (sum y (sum x z))" by (simp add: hom) 
also have "... = hom (sum x (sum y z))" by (simp add: sum.lcomm) 
finally show ?thesis . 
qed 

Importing via a locale expression imports all facts of the imported locales, hence
both sum.lcomm and prod.lcomm are available in semi_hom. The import is dynamic:
whenever facts are added to a locale, they automatically become
available in subsequent theorem commands that use the locale as a target, or
a locale importing the locale.
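
The theorem proved in semi_hom above can also be checked on a concrete instance. In the Python sketch below, the instantiation is an assumption chosen for illustration: sum is integer addition, prod is integer multiplication, and hom maps x to 2**x, so that hom (sum x y) = hom x • hom y holds.

```python
from itertools import product

def is_hom(carrier, sum_, prod, hom):
    """Assumption `hom` of locale semi_hom."""
    return all(hom(sum_(x, y)) == prod(hom(x), hom(y))
               for x, y in product(carrier, repeat=2))

def theorem_holds(carrier, sum_, prod, hom):
    """hom x . (hom y . hom z) = hom (sum x (sum y z))."""
    return all(prod(hom(x), prod(hom(y), hom(z))) == hom(sum_(x, sum_(y, z)))
               for x, y, z in product(carrier, repeat=3))

# Hypothetical instance: (N, +) mapped into (N, *) by exponentiation.
add = lambda x, y: x + y
mul = lambda x, y: x * y
power_of_two = lambda x: 2 ** x

print(is_hom(range(4), add, mul, power_of_two),
      theorem_holds(range(4), add, mul, power_of_two))
```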



4.2 Normal Forms 



Locale expressions are interpreted in a two-step process. First, an expression is 
normalised, then it is converted to a list of context elements. 

Normal forms are based on locale declarations. These consist of an import section
followed by a list of context elements. Let $\mathcal{I}(l)$ denote the locale expression
imported by locale $l$. If $l$ has no import then $\mathcal{I}(l) = \varepsilon$. Likewise, let $\mathcal{F}(l)$ denote
the list of context elements, also called the context fragment of $l$. Note that $\mathcal{F}(l)$
contains only those context elements that are stated in the declaration of $l$, not
imported ones.

Example 1. Consider the locales semi and comm_semi. We have $\mathcal{I}(\mathit{semi}) = \varepsilon$ and
$\mathcal{I}(\mathit{comm\_semi}) = \mathit{semi}$, and the context fragments are



$\mathcal{F}(\mathit{semi})$:

fixes prod :: "['a, 'a] ⇒ 'a" (infixl "•" 70)
assumes "semi prod"
notes assoc: "?x • ?y • ?z = ?x • (?y • ?z)"

$\mathcal{F}(\mathit{comm\_semi})$:

assumes "comm_semi_axioms prod"
notes comm: "?x • ?y = ?y • ?x"
notes lcomm: "?x • (?y • ?z) = ?y • (?x • ?z)"



Let $\pi_0(\mathcal{F}(l))$ denote the list of parameters defined in the fixes elements of $\mathcal{F}(l)$
in the order of their occurrence. The list of parameters of a locale expression
$\pi(e)$ is defined as follows:

\[
\begin{aligned}
\pi(l) &= \pi(\mathcal{I}(l)) \mathbin{@} \pi_0(\mathcal{F}(l)), \text{ for named locale } l.\\
\pi(e\ q_1 \ldots q_n) &= [q_1, \ldots, q_n, p_{n+1}, \ldots, p_m], \text{ where } \pi(e) = [p_1, \ldots, p_m].\\
\pi(e_1 + e_2) &= \pi(e_1) \mathbin{@} \pi(e_2)
\end{aligned}
\]

The operation @ concatenates two lists but omits elements from the second list
that are also present in the first list. It is not possible to rename more parameters
than are present in an expression (that is, $n \le m$ must hold); otherwise the
renaming is illegal. If $q_i$ = _ then the $i$th entry of the resulting list is $p_i$.

In the normalisation phase, imports of named locales are unfolded, and renames
and merges are recursively propagated to the imported locale expressions. The
result is a list of locale names, each with a full list of parameters, where locale
names occurring with the same parameter list twice are removed. Let $\mathcal{N}$ denote
normalisation. It is defined by these equations:

$\mathcal{N}(l) = \mathcal{N}(\mathcal{I}(l)) \mathbin{@} [l\ \pi(l)]$, for named locale $l$.

$\mathcal{N}(e\ q_1 \ldots q_n) = \mathcal{N}(e)\,[q_1 \ldots q_n / \pi(e)]$

$\mathcal{N}(e_1 + e_2) = \mathcal{N}(e_1) \mathbin{@} \mathcal{N}(e_2)$

Normalisation yields a list of identifiers. An identifier consists of a locale name 
and a (possibly empty) list of parameters. 
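
The two definitions above can be read as a pair of recursive functions. The following OCaml sketch is only an illustration and not part of the Locale package: the types expr and decl, the association list decls and all function names are assumptions made for this example, and the renaming case does not remove duplicate identifiers that a renaming might create.

```ocaml
(* Hypothetical model of locale expressions; all names are illustrative. *)
type expr =
  | Name of string                 (* named locale l *)
  | Rename of expr * string list   (* e q1 ... qn; "_" keeps the old name *)
  | Merge of expr * expr           (* e1 + e2 *)

(* A declaration: the import expression (None for the empty expression)
   and the parameters fixed by the locale's own fixes elements. *)
type decl = { import : expr option; fixed : string list }

(* An identifier of the normal form: locale name plus parameter list. *)
type ident = string * string list

(* The @ operation: append, omitting elements already in the first list. *)
let merge xs ys = xs @ List.filter (fun y -> not (List.mem y xs)) ys

(* Replace the first n entries of ps by qs; "_" leaves an entry unchanged. *)
let rec rename_params ps qs =
  match ps, qs with
  | _, [] -> ps
  | [], _ :: _ -> invalid_arg "illegal renaming: n > m"
  | p :: ps', q :: qs' -> (if q = "_" then p else q) :: rename_params ps' qs'

(* Parameter list of a locale expression. *)
let rec params decls e =
  match e with
  | Name l ->
      let d = List.assoc l decls in
      let imported =
        match d.import with None -> [] | Some i -> params decls i in
      merge imported d.fixed
  | Rename (e', qs) -> rename_params (params decls e') qs
  | Merge (e1, e2) -> merge (params decls e1) (params decls e2)

(* Normalisation of a locale expression to a list of identifiers. *)
let rec normalise decls e : ident list =
  match e with
  | Name l ->
      let d = List.assoc l decls in
      let imported =
        match d.import with None -> [] | Some i -> normalise decls i in
      merge imported [ (l, params decls e) ]
  | Rename (e', qs) ->
      let old_ps = params decls e' in
      let subst = List.combine old_ps (rename_params old_ps qs) in
      let sub p = try List.assoc p subst with Not_found -> p in
      List.map (fun (l, ps) -> (l, List.map sub ps)) (normalise decls e')
  | Merge (e1, e2) -> merge (normalise decls e1) (normalise decls e2)

(* The locales semi, comm_semi and semi_hom, reduced to imports and fixes. *)
let decls =
  [ ("semi", { import = None; fixed = [ "prod" ] });
    ("comm_semi", { import = Some (Name "semi"); fixed = [] });
    ("semi_hom",
     { import =
         Some (Merge (Rename (Name "comm_semi", [ "sum" ]), Name "comm_semi"));
       fixed = [ "hom" ] }) ]

let semi_hom_params = params decls (Name "semi_hom")
let semi_hom_normal = normalise decls (Name "semi_hom")
```

Under these assumptions, semi_hom_params is the list sum, prod, hom, and semi_hom_normal lists semi and comm_semi first over sum, then over prod, followed by semi_hom with all three parameters.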

In the second phase, the list of identifiers $\mathcal{N}(e)$ is converted to a list of context
elements $\mathcal{C}(e)$ by converting each identifier to a list of context elements, and
flattening the obtained list. Conversion of the identifier $l\ q_1 \ldots q_n$ yields the list
of context elements $\mathcal{F}(l)$, but with the following modifications:

— Rename the parameter in the $i$th fixes element of $\mathcal{F}(l)$ to $q_i$, $i = 1, \ldots, n$. If
the parameter name is actually changed then delete the syntax annotation.
Renaming a parameter may also change its type.

— Perform the same renamings on all occurrences of parameters (fixed variables)
in assumes, defines and notes elements.

— Qualify names of facts by $q_1\_\ldots\_q_n$.

The locale expression is well-formed if it contains no illegal renamings and the
following conditions on $\mathcal{C}(e)$ hold, otherwise the expression is rejected:

— Parameters in fixes are distinct; 

— Free variables in assumes and defines occur in preceding fixes; 3 

— Parameters defined in defines must neither occur in preceding assumes nor 
defines. 
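
The three conditions can be checked in a single pass over the list of context elements. The OCaml sketch below is an illustration under simplifying assumptions: the type elem is hypothetical and records only the names that matter for the check, and the relaxation for contexts obtained with includes is not modelled.

```ocaml
(* Hypothetical, simplified context elements; names are illustrative. *)
type elem =
  | Fixes of string                  (* parameter name *)
  | Assumes of string list           (* free variables of the assumption *)
  | Defines of string * string list  (* defined parameter, free variables of the body *)
  | Notes of string                  (* fact name; irrelevant for the check *)

let well_formed elems =
  (* fixed: parameters fixed so far;
     used: variables occurring in preceding assumes and defines *)
  let rec go fixed used = function
    | [] -> true
    | Fixes p :: rest ->
        (* condition 1: parameters in fixes are distinct *)
        not (List.mem p fixed) && go (p :: fixed) used rest
    | Assumes fv :: rest ->
        (* condition 2: free variables occur in preceding fixes *)
        List.for_all (fun v -> List.mem v fixed) fv
        && go fixed (fv @ used) rest
    | Defines (p, fv) :: rest ->
        List.for_all (fun v -> List.mem v fixed) (p :: fv)
        (* condition 3: p occurs in no preceding assumes or defines *)
        && not (List.mem p used)
        && go fixed ((p :: fv) @ used) rest
    | Notes _ :: rest -> go fixed used rest
  in
  go [] [] elems

(* A context that fixes prod twice, and one that fixes it once. *)
let clash =
  [ Fixes "prod"; Assumes [ "prod" ]; Notes "assoc";
    Fixes "prod"; Fixes "one"; Assumes [ "prod"; "one" ] ]

let shared =
  [ Fixes "prod"; Assumes [ "prod" ]; Notes "assoc";
    Fixes "one"; Assumes [ "prod"; "one" ] ]
```

In this sketch, well_formed clash is false because prod is fixed twice, while well_formed shared is true.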



3 This restriction is relaxed for contexts obtained with includes, see Section 4.4. 







4.3 Examples 

Example 2. We obtain the context fragment $\mathcal{C}$(comm_semi) of the locale comm_semi.
First, the parameters are computed.

$\pi$(semi) = [prod]

$\pi$(comm_semi) = $\pi$(semi) @ [] = [prod]

Next, the normal form of the locale expression comm_semi is obtained.

$\mathcal{N}$(semi) = [semi prod]

$\mathcal{N}$(comm_semi) = $\mathcal{N}$(semi) @ [comm_semi prod] = [semi prod, comm_semi prod]

Converting this to a list of context elements leads to the list (2) shown in Section 3.3,
but with fact names qualified by prod (for example, prod.assoc).
Qualification was omitted to keep the presentation simple. Isabelle's scoping
rules identify the most recent fact with qualified name x.a when a fact with
name a is requested.

Example 3. The locale expression comm_semi sum involves renaming. Computing
parameters yields $\pi$(comm_semi sum) = [sum], the normal form is

$\mathcal{N}$(comm_semi sum) = $\mathcal{N}$(comm_semi) [sum/prod] = [semi sum, comm_semi sum]

and the list of context elements

fixes sum :: "['a, 'a] => 'a"
assumes "semi sum"
notes sum.assoc: "sum (sum ?x ?y) ?z = sum ?x (sum ?y ?z)"
assumes "comm_semi_axioms sum"
notes sum.comm: "sum ?x ?y = sum ?y ?x"
notes sum.lcomm: "sum ?x (sum ?y ?z) = sum ?y (sum ?x ?z)"

Example 4. The context defined by the locale semi_hom involves merging two
copies of comm_semi. We obtain parameter list and normal form:

$\pi$(semi_hom) = $\pi$(comm_semi sum + comm_semi) @ [hom]
= ($\pi$(comm_semi sum) @ $\pi$(comm_semi)) @ [hom]
= ([sum] @ [prod]) @ [hom] = [sum, prod, hom]

$\mathcal{N}$(semi_hom) = $\mathcal{N}$(comm_semi sum + comm_semi) @ [semi_hom sum prod hom]
= ($\mathcal{N}$(comm_semi sum) @ $\mathcal{N}$(comm_semi)) @ [semi_hom sum prod hom]
= ([semi sum, comm_semi sum] @ [semi prod, comm_semi prod]) @ [semi_hom sum prod hom]
= [semi sum, comm_semi sum, semi prod, comm_semi prod, semi_hom sum prod hom].

Hence $\mathcal{C}$(semi_hom), shown below, is again well-formed.

fixes sum :: "['a, 'a] => 'a"
assumes "semi sum"
notes sum.assoc: "sum (sum ?x ?y) ?z = sum ?x (sum ?y ?z)"
assumes "comm_semi_axioms sum"
notes sum.comm: "sum ?x ?y = sum ?y ?x"
notes sum.lcomm: "sum ?x (sum ?y ?z) = sum ?y (sum ?x ?z)"
fixes prod :: "['b, 'b] => 'b" (infixl "•" 70)
assumes "semi prod"
notes prod.assoc: "?x • ?y • ?z = ?x • (?y • ?z)"
assumes "comm_semi_axioms prod"
notes prod.comm: "?x • ?y = ?y • ?x"
notes prod.lcomm: "?x • (?y • ?z) = ?y • (?x • ?z)"
fixes hom :: "'a => 'b"
assumes "semi_hom_axioms sum"
notes sum_prod_hom.hom: hom (sum x y) = hom x • hom y

Example 5. This example presents a locale expression whose list of context
elements is not well-formed, and illustrates how normalisation deals with
multiple inheritance. Consider the specification of monads
(in the algebraic sense) and monoids.

locale monad =
fixes prod :: "['a, 'a] => 'a" (infixl "•" 70) and one :: 'a ("1" 100)
assumes l_one: "1 • x = x" and r_one: "x • 1 = x"

Monoids are both semigroups and monads and one would want to specify them as 
locale expression semi + monad. Unfortunately, this expression is not well-formed. 
Its normal form 

$\mathcal{N}$(monad) = [monad prod]

$\mathcal{N}$(semi + monad) = $\mathcal{N}$(semi) @ $\mathcal{N}$(monad) = [semi prod, monad prod]

leads to a list containing the context element

fixes prod :: "['a, 'a] => 'a" (infixl "•" 70)

twice and thus violating the first criterion of well-formedness. To avoid this 
problem, one can introduce a new locale magma with the sole purpose of fixing the 
parameter and defining its syntax. The specifications of semigroup and monad 
are changed so that they import magma. 

locale magma = fixes prod (infixl "•" 70)

locale semi' = magma + assumes assoc: "(x • y) • z = x • (y • z)"
locale monad' = magma + fixes one ("1" 100)
assumes l_one: "1 • x = x" and r_one: "x • 1 = x"

Normalisation now yields

$\mathcal{N}$(semi' + monad') = $\mathcal{N}$(semi') @ $\mathcal{N}$(monad')
= ($\mathcal{N}$(magma) @ [semi' prod]) @ ($\mathcal{N}$(magma) @ [monad' prod])
= [magma prod, semi' prod] @ [magma prod, monad' prod]
= [magma prod, semi' prod, monad' prod]

where the second occurrence of magma prod is eliminated. The reader is encouraged
to check, using the print_locale command, that the list of context elements
generated from this is indeed well-formed.

It follows that importing parameters is more flexible than fixing them using a
context element. The Locale package provides the predefined locale var that
can be used to import parameters if no particular mixfix syntax is required. Its
definition is



locale var = fixes x_ 

The use of the internal variable x_ enforces that the parameter is renamed before 
being used, because internal variables may not occur in the input syntax. 



4.4 Includes 

The context element includes takes a locale expression $e$ as argument. It can
occur at any point in a locale declaration, and it adds $\mathcal{C}(e)$ to the current context.
If includes $e$ appears as context element in the declaration of a named locale
$l$, the included context is only visible in subsequent context elements, but it is
not propagated to $l$. That is, if $l$ is later used as a target, context elements from
$\mathcal{C}(e)$ are not added to the context. Although it is conceivable that this mechanism
could be used to add only selected facts from $e$ to $l$ (with notes elements
following includes $e$), currently no useful applications of this are known.

The more common use of includes e is in long goals, where it adds, like a target, 
locale context to the proof context. Unlike with targets, the proved theorem is 
not stored in the locale. Instead, it is exported immediately. 

theorem lcomm2:
includes comm_semi shows "x • (y • z) = y • (x • z)"
proof -
have "x • (y • z) = (x • y) • z" by (simp add: assoc)
also have "... = (y • x) • z" by (simp add: comm)
also have "... = y • (x • z)" by (simp add: assoc)
finally show ?thesis .
qed

This proof is identical to the proof of lcomm. The use of includes provides the
same context and facts as when using comm_semi as target. On the other hand,
lcomm2 is not added as a fact to the locale comm_semi, but is directly visible in
the theory. The theorem is

comm_semi ?prod ==> ?prod ?x (?prod ?y ?z) = ?prod ?y (?prod ?x ?z).

Note that it is possible to combine a target and (several) includes in a goal 
statement, thus using contexts of several locales but storing the theorem in only 
one of them. 

5 Structures 

The specifications of semigroups and monoids that served as examples in previous
sections modelled each operation of an algebraic structure as a single parameter.
This is rather inconvenient for structures with many operations, and
also unnatural. In accordance with mathematical texts, one would rather fix two
groups instead of two sets of operations.

The approach taken in Isabelle is to encode algebraic structures with suitable 
types (in Isabelle/HOL usually records). An issue to be addressed by locales is 
syntax for algebraic structures. This is the purpose of the (structure) annota- 
tion in fixes, introduced by Wenzel. We illustrate this, independently of record 
types, with a different formalisation of semigroups. 

Let ’a semi_type be a not further specified type that represents semigroups over 
the carrier type ’a. Let s_op be an operation that maps an object of ’a semi_type 
to a binary operation. 

typedecl ’a semi_type 

consts s_op :: "['a semi_type, 'a, 'a] => 'a" (infixl "⋆ı" 70)

Although s_op is a ternary operation, it is declared infix. The syntax annotation 
contains the token ı (\<index>), which refers to the first argument. This syntax
is only effective in the context of a locale, and only if the first argument is a 
structural parameter — that is, a parameter with annotation (structure). The 
token has the effect of replacing the parameter with a subscripted number, the 
index of the structural parameter in the locale. This replacement takes place 
both for printing and parsing. Subscripted 1 for the first structural parameter 
may be omitted, as in this specification of semigroups with structures: 

locale comm_semi' =
fixes G :: "'a semi_type" (structure)
assumes assoc: "(x ⋆ y) ⋆ z = x ⋆ (y ⋆ z)" and comm: "x ⋆ y = y ⋆ x"

Here x ⋆ y is equivalent to x ⋆₁ y and abbreviates s_op G x y. A specification
of homomorphisms requires a second structural parameter.

locale semi'_hom = comm_semi' + comm_semi' H +
fixes hom
assumes hom: "hom (x ⋆ y) = hom x ⋆₂ hom y"

The parameter H is defined in the second fixes element of $\mathcal{C}$(semi'_hom). Hence
⋆₂ abbreviates s_op H x y. In general, the $i$th structural parameter is addressed
by index $i$. Only the index 1 may be omitted. The same construction can be done
with records instead of an ad-hoc type.

record 'a semi = prod :: "['a, 'a] => 'a" (infixl "•ı" 70)

This declares the types 'a semi and ('a, 'b) semi_scheme. The latter is an extensible
record, where the second type argument is the type of the extension
field. For details on records, see [7], Chapter 8.3.

locale semi_w_records = struct G + 

assumes assoc: "(x • y) • z = x • (y • z)"

The type (’a, ’b) semi_scheme is inferred for the parameter G. Using subtyping 
on records, the specification can be extended to groups easily. 

record 'a group = "'a semi" +
one :: "'a" ("1ı" 100)
inv :: "'a => 'a" ("invı _" [81] 80)

locale group_w_records = semi_w_records +
assumes l_one: "1 • x = x" and l_inv: "inv x • x = 1"

Finally, the predefined locale

locale struct = fixes S_ (structure).

is analogous to var. More examples on the use of structures, including groups,
rings and polynomials can be found in the Isabelle distribution in the session
HOL-Algebra.



6 Conclusions and Outlook 

Locales provide simple means of modular reasoning. They make it possible to abbreviate
frequently occurring context statements and to maintain facts valid in these contexts.
Importantly, using structures, they allow syntax to be effective only in
certain contexts, and thus to mimic common practice in mathematics, where
notation is chosen very flexibly. This is also known as literate formalisation [2].
Locale expressions make it possible to duplicate and merge specifications. This is a
necessity, for example, when reasoning about homomorphisms. Normalisation makes
it possible to deal with diamond-shaped inheritance structures, and generally
with directed acyclic graphs. The combination of locales with record types in
higher-order logic provides an effective means for specifying algebraic structures:
locale import and record subtyping provide independent hierarchies for specifications
and structure elements. Rich examples for this can be found in the Isabelle
distribution in the session HOL-Algebra.

The primary reason for writing this report was to provide a better understanding of
locales in Isar. Wenzel provided hardly any documentation, with the exception
of [9]. The present report should make it easier for users of Isabelle to take
advantage of locales.

The report is also a base for future extensions. These include improved syntax
for structures. Identifying them by numbers seems unnatural and can be confusing
if more than two structures are involved (for example, when reasoning
about universal properties), and numbering them by order of occurrence seems
arbitrary. Another desirable feature is instantiation. One may, in the course of
a theory development, construct objects that fulfil the specification of a locale.
These objects are possibly defined in the context of another locale. Instantiation
should make it simple to specialise abstract facts for the object under consideration
and to use the specified facts.

A detailed comparison of locales with module systems in type theory has not
been undertaken yet, but could be beneficial. For example, a module system
for Coq has recently been presented by Chrząszcz [3]. While such module systems
usually constitute extensions of the calculus, locales are a rather thin layer that
does not change Isabelle's meta logic. Locales mainly manage specifications and
facts. Functors, like the constructor for polynomial rings, remain objects of the
logic.

Acknowledgement. Lawrence C. Paulson and Norbert Schirmer made useful
comments on a draft of this paper.



References 

1. David Aspinall. Proof General: A generic tool for proof development. In Susanne
Graf and Michael I. Schwartzbach, editors, TACAS 2000, number 1785 in LNCS,
pages 38-42. Springer, 2000.

2. Anthony Bailey. The machine-checked literate formalisation of algebra in type
theory. PhD thesis, University of Manchester, January 1998.

3. Jacek Chrząszcz. Implementing modules in the Coq system. In David Basin and
Burkhart Wolff, editors, TPHOLs 2003, number 2758 in LNCS, pages 270-286.
Springer, 2003.

4. Florian Kammüller. Modular reasoning in Isabelle. In David McAllester, editor,
CADE 17, number 1831 in LNCS, pages 99-114. Springer, 2000.

5. Gerwin Klein. Verified Java Bytecode Verification. PhD thesis, Institut für
Informatik, Technische Universität München, 2003.

6. Tobias Nipkow. Structured proofs in Isar/HOL. In H. Geuvers and F. Wiedijk,
editors, TYPES 2002, number 2646 in LNCS, pages 259-278. Springer, 2003.

7. Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL: A Proof
Assistant for Higher-Order Logic. Number 2283 in LNCS. Springer, 2002.

8. Markus Wenzel. Isabelle/Isar: a versatile environment for human-readable formal
proof documents. PhD thesis, Technische Universität München, 2002.
Electronically published as
http://tumb1.biblio.tu-muenchen.de/publ/diss/in/2002/wenzel.html.

9. Markus Wenzel. Using locales in Isabelle/Isar. Part of the Isabelle2003
distribution, file src/HOL/ex/Locales.thy. Distribution of Isabelle available at
http://isabelle.in.tum.de, 2002.

10. Markus Wenzel. The Isabelle/Isar reference manual. Part of the Isabelle2003
distribution, available at http://isabelle.in.tum.de, 2003.




Introduction to PAF!, a Proof Assistant 
for ML Programs Verification 



Sylvain Baro 



PPS - CNRS UMR 7126 
Université Denis Diderot
Case 7014 
2, Place Jussieu 
75251 PARIS Cedex 05 



Abstract. We present here a proof assistant dedicated to the proof of 
ML programs. This document is oriented from a user’s point of view. 
We introduce the system progressively, describing its features as they 
become useful, and justifying our choices all along. 

Our system intends to provide a usual predicate calculus to express and 
prove properties of functional ML terms including higher order functions 
with polymorphic types. To achieve this goal, functional expressions are 
embedded in the logic as first class terms, with their usual syntax and 
evaluation rules. 



The purpose of this paper is to introduce the reader to PAF!, a proof assistant
dedicated to the verification of properties of programs written in the functional
core of the ML language. More precisely, we will put emphasis on the main
features of our system: the convenience and expressive power of our logic to write
programs and their specifications; and the innovative design of our interactive
proof assistant, which allows a high level of interactivity and which simplifies the
integration of new tactics into the system.

These two aspects represent our contribution to both the fields of program 
verification and interactive theorem proving. 

Our system is formalised so as to allow formulas to express and prove prop- 
erties of ML terms. It has two levels: a programming language and a logical 
language. The programming language is a strictly functional ML which includes: 
algebraic datatypes, functions defined by pattern matching, recursion and poly- 
morphism. Furthermore, it allows definition of partial functions. The logical lan- 
guage combines a multisorted classical predicate calculus, whose sorts are the 
ML datatypes and whose first class terms are ML terms, with a dedicated proof 
language. 

As in ACL2 or nqthm [9,3], in PAF! formulas assert properties of programs,
but we chose not to encode logic in the programming language. In a sense, we
are closer to PVS [14], which mixes an extended lambda-calculus with higher
order predicate calculus, but our term language has both the syntax and the
evaluation rules of ML programs. Like PVS, our system can handle partial
functions, but without using subtyping. As in Type Theory [13], we have
inductive datatypes available, but not dependent ones (except for type polymorphism),
and datatypes cannot embed any logical content. As opposed to Type
Theory, proofs do not need to lead to programs, hence we do not need to remain
within a constructive framework.

In the world of design and implementation of proof assistants in general, we
think our approach to be novel. We are guided by the concern of being able
to offer the user "full page" editing capabilities for the development of theories
(programs and their intended properties) and proofs. This led us to conceive
the architecture of our system as a dialogue between a proof engine in charge
of the logical part (in particular, of the validity of each stage of the proof) and
a (graphical) user interface, handling editing. The two communicate in a client-server
fashion through a dedicated proof engine protocol [11]. This aspect of our
work was carried out in collaboration with the working group MathOS of the
laboratory PPS.

Stress was also put on the extension capabilities of our system. We tried 
to ease as much as possible the writing of new tactics, without jeopardising the 
correctness of the system. We use an original architecture, mixing object oriented 
programming and functional programming (made possible by the use of Objective 
Caml). This architecture satisfies De Bruijn’s criterion which requires that a 
proof assistant should be able to generate a proof object in a simpler formalism, 
amenable to double-checking. The current version of the proof engine, including
tactics, consists of more than 15000 lines of Objective Caml code.

We are convinced that program certification tools should be usable by programmers,
and not only by computer scientists: programs should be written as
usual. We think that, when we intend to write a certified piece of software, it
is sensible to use a restriction of the programming language, but we must stay
in the same formalism. This led us to the credo that guided all the design of
our proof assistant: WYSIWIP, for "What You See Is What You Prove": in our
system, what the user appears to prove is actually what he proves. There is no
encoding in a hidden logical framework, which would lead him to prove things
which seem (to him) unrelated to his problem, forcing him to make the effort
to understand the underlying part of the system.

The first section shows how to write programs in our framework. Section 2
discusses the issue of program termination. Sections 3 and 4 present the specification
language and the proof language. The last part of this paper (Sections 5 and
6) presents the graphical user interface and key details of the implementation of
the system. We conclude with a short comparison with other systems.

Simple examples are used throughout this paper to illustrate our discourse. 
The interested reader might find more complicated ones in [1]. 

1 ML Functional Core 

1.1 Language

In the following, we use examples to introduce the reader to our system. Commands
and answers of the system are given nearly verbatim. The real syntax,
which is very close, depends on the user interface. Either it is the toplevel interface,
and the commands are embedded in simple Objective Caml function calls
(for example, the tactic Intro is called with p "Intro"), or it is the graphical
user interface, and the commands are typed directly in the placeholders
(boxes) or through the use of buttons and menus.

A beginner's example. Assume that Nil and Cons are the two constructors of
polymorphic lists. The definition of the append function may be written as follows:

let rec append l1 l2 =
match l1 with
Nil -> l2
| Cons(x,l1) -> Cons(x, (append l1 l2))

This is the recursive definition of a function using pattern matching facilities.
From this definition, knowing the types of Nil and Cons, the system infers the
expected ML type for append: (['a] list -> (['a] list -> ['a] list)).

From the formal point of view, the ML type assigned to append is a sort.
This is a syntactical property of the definition and does not mean that append
will map any pair of lists to a list (i.e. is total). The totality of a function can be
asserted apart from the definition itself: the system provides a special command
for this, which will be discussed in Section 2.

This distinction between, let us say, "syntactical" and "logical" types allows
one to define partial functions using partial pattern matching, as in:

let head x = match x with Cons(x,l) -> x

This function is typed by the system as ['a] list -> 'a, even though it is a partial
function. Defining partial functions is allowed in our system, using non-exhaustive
pattern matching, because we use more than a mere case operator.
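
For comparison, the same partial definition can be written in plain OCaml. This is a sketch for illustration and not PAF! input: the type lst and the helper head_opt are names assumed for this example, and the run-time behaviour shown (an exception on Nil) is OCaml's, whereas in the logic of PAF! such a term simply does not reduce.

```ocaml
(* A list type with explicit constructors, standing in for ['a] list. *)
type 'a lst = Nil | Cons of 'a * 'a lst

(* Non-exhaustive on purpose. The attribute silences OCaml's warning 8
   (non-exhaustive match); the inferred type is still 'a lst -> 'a. *)
let[@warning "-8"] head x = match x with Cons (y, _) -> y

(* The type says nothing about totality: on Nil the match fails at run time. *)
let head_opt x = try Some (head x) with Match_failure _ -> None
```

The inferred type plays the role of the "syntactical" type above; that head is not total only shows up when it is applied to Nil.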

Datatypes. Let us now define a new datatype for binary trees:

type ['a] tree = Leaf of 'a | Node of ['a] tree * ['a] tree

This definition introduces a new polymorphic sort ['a] tree and two constructor
symbols Leaf and Node together with their respective sorts 'a -> ['a]
tree and (['a] tree * ['a] tree) -> ['a] tree. At the logical level, the
sort assignments for constructors are interpreted as introduction rules for the
corresponding inductive type and a second order formula is generated, which
corresponds to the elimination rule of this type (its structural induction principle).
This view is close to the ones in [12,10,16].

Using our new data structure, one can define a traversal function which maps 
a tree to a list of its labels: 

let rec toList t =
match t with
Leaf(x) -> Cons(x,Nil)
| Node(Leaf(x),t2) -> Cons(x, (toList t2))
| Node(Node(t1,t2),t3) -> (toList Node(t1,Node(t2,t3)))

The function toList has type [’a] tree -> [’a] list. 

1.2 Reduction, Evaluation, and Equality

We set the semantics of our programming language with its reduction rules, which
describe how terms reduce to values. Our reduction rules are given by
the natural semantics of ML ([4]), which is a weak call-by-value reduction.

However, there is a difference between evaluation in a programming language
scope and evaluation in a logical scope: in the latter, we might find
logical variables in terms. For instance the pattern matching expression (match
Leaf(t) with Leaf(x) -> t1 | z -> t2) evaluates to t1[t/x] while (match
x with Leaf(x) -> t1 | z -> t2) does not evaluate: it is considered as a
weak normal form. The only case where a head formal variable does not stop
the reduction of a pattern matching expression has the form (match x with z
-> u | ...), which evaluates to u[x/z].

Reduction is built-in and is axiomatised as a relation between terms. But 
when reasoning about programs, one may need to check equality between terms. 
The only predefined equality provided is the built-in equality at the programming 
level: the polymorphic operator = of ML. Its semantics is given by the following: 

— C(t1) = C(t2) evaluates to the boolean ML value true if C is a constructor
symbol and t1 = t2 evaluates to true;

— C1(t1) = C2(t2) evaluates to false if C1 and C2 are distinct constructors;

— t = t evaluates to true, even if t is a variable. 

— All the other equational terms are considered as being in normal form. This 
is in particular the case for x = y where x and y are variables. 
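
The four rules can be sketched as a small symbolic evaluator. The OCaml code below is an illustration only, not the implementation used in PAF!: the type term and the function eq are hypothetical, constructors carry a list of arguments, and the rules are followed literally, so that an equation between equal constructors whose arguments are not all known to be equal is left in normal form.

```ocaml
(* Hypothetical term representation: logical variables and constructor
   applications. *)
type term =
  | Var of string
  | Con of string * term list

(* Some b: the equation evaluates to the ML boolean b.
   None: the equation is in normal form. *)
let rec eq t u =
  match t, u with
  | Con (c, ts), Con (d, us) ->
      if c <> d then Some false                        (* rule 2 *)
      else if List.length ts = List.length us
              && List.for_all2 (fun a b -> eq a b = Some true) ts us
      then Some true                                   (* rule 1 *)
      else None                                        (* rule 4 *)
  | _ ->
      if t = u then Some true                          (* rule 3 *)
      else None                                        (* rule 4 *)
```

For instance, eq (Var "x") (Var "x") is Some true, whereas eq (Var "x") (Var "y") is None, matching the last two rules.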

2 To Be Total (Or Not to Be Total) 

An often needed property of functions involved in a program development is
termination. In our system, termination can be established, among other ways, by
proving that the function is totally defined on its intended type. In this section
we explain what it means in our framework to be "totally defined on the intended type".
The presentation of the construction of the proofs is delayed until Section 4.

A First Example. Let us consider again the append function. The user prompts 
the system for proving its totality by the following command: 

Total append 

Then the system urges the user to prove the following: 

|-
(Forall 'a: type (Forall x: ['a] list (Forall y: ['a] list
((append x y) : ['a] list))))

How should we read this formula? Syntactically, it consists of three bound
universal quantifiers and ends with the type assignment ((append x y) : ['a]
list). First, we must distinguish between two kinds of use of universal bindings:
Forall 'a: type on one side and the two Forall x: ['a] list, Forall y:
['a] list on the other side.

The Forall 'a: type binding introduces 'a as a type parameter which can
be used in the subformula. The quantification is needed to make sure that the
'a in the two following bindings are the same type parameter.

But despite the use of the Forall keyword, Forall 'a: type is not a genuine
logical quantifier (we are definitely not in Type Theory). It is merely a
syntactical construct to mark that only ML type expressions may be substituted
for the parameter 'a. So neither formulas nor terms can be substituted
here, which excludes dependent types.

The Forall x: ['a] list binding may be read as the usual syntactical
shortcut for bounded quantification. From the logical point of view, a formula such as
(Forall x:tau F), where tau is an ML type expression and F a formula, is expanded
to (Forall x (x:tau) -> F), where -> is the propositional implication.
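
This expansion can be sketched as a function on formulas. The OCaml datatype below is hypothetical and deliberately small: terms and types are kept as strings, and the Forall 'a: type binding is not modelled, since it is not a logical quantifier.

```ocaml
(* Hypothetical formula syntax; constructor names are illustrative. *)
type formula =
  | HasType of string * string            (* t : tau *)
  | Atom of string                        (* any other atomic formula *)
  | Imp of formula * formula              (* propositional implication *)
  | Forall of string * formula            (* plain quantifier *)
  | ForallB of string * string * formula  (* bounded quantifier Forall x:tau F *)

(* Replace every bounded quantifier by a plain one guarded by a typing
   premise. *)
let rec expand = function
  | ForallB (x, tau, f) -> Forall (x, Imp (HasType (x, tau), expand f))
  | Forall (x, f) -> Forall (x, expand f)
  | Imp (f, g) -> Imp (expand f, expand g)
  | (HasType _ | Atom _) as f -> f

(* The totality statement of append, without its type binding. *)
let total_append =
  ForallB ("x", "['a] list",
    ForallB ("y", "['a] list",
      HasType ("(append x y)", "['a] list")))

let expanded = expand total_append
```

In expanded, each bounded quantifier has become a plain quantifier whose body is an implication with the corresponding typing predicate as premise.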

The explanation we gave for the second kind of bounded quantifier leads us
to make precise the meaning of what we write as the type assignments: x : ['a]
list or ((append x y) : ['a] list). Formally, it is a schematic atomic formula
where : ['a] list is the predicate symbol and x and (append x y) are the
arguments. The intended semantics of such a predicate is that its argument has 
sort [’a] list (for the ML type discipline) and that it terminates. In other 
words, ((append x y) : [’a] list) means that (append x y) will evaluate 
to a list. The formal setting of this use of type assignment is given in [2] 1 and is 
called strong typing. 

To end the commentary about this first example, let us mention that, whenever
a Forall x: ['a] list quantifier is instantiated with some term t, it is
required to prove that t: ['a] list holds. But having proved the totality of
all functions allows the proof engine to prove automatically most of these statements
of termination. For example if the user needs to prove that a term t
terminates, and t contains a call to append, but neither fixpoints nor partial
pattern matching, and the totality of append has been previously proved, then
the statement will be proved automatically using type inference. This allows the
user to ignore nearly all the typing proofs that are commonly needed in most
theorem provers.

Higher Order Functions. Let us define now the higher order function map: 

let rec map f l =
match l with
Nil -> Nil
| Cons(x,xs) -> Cons((f x), (map f xs))



1 We apologise that its English version is not yet available.

The command Total map will produce the following proof obligation:

|-
(Forall 'a : type (Forall 'b : type (Forall f: 'a -> 'b
(Forall l: ['a] list ((map f l): ['b] list)))))

The novelty here is the type restriction f : ’a -> ’b, which entails that map will 
be total as long as it is applied to functions that are themselves total. It is thus 
not allowed to apply map to the partial head function. 

Partial terms do not receive any particular treatment: if one wants to prove,
say, Odd (predecessor 0) (where predecessor is partially defined and weakly
typed nat -> nat), one is stuck because (predecessor 0) cannot be reduced.

3 Specification Language 

It is now time to express more intentional properties on the defined functions. 
PAF! uses a usual vernacular language which comes with function and type 
definitions. We have already met the Total command, we also use the commands 
Declare, Definition, Axiom and Theorem. 

Predicates. Let us begin with a simple example: the predicate Mem(x,l) which
expresses membership of x in the list l.

We first introduce the predicate symbol Mem together with its sort: 

Declare Mem : (’a * [’a] list) -> Prop 

where Prop denotes the sort of logical propositions. We forbid the sort Prop
on the left hand side of a sort arrow. Thus we do not have full higher order logic
but only the fragment needed to assert properties of (possibly higher order) functions.
The intended meaning of the Mem predicate may be given by the two axioms:

Axiom MemB :
(Forall 'a: type
(Forall x: 'a (Forall l: ['a] list Mem(x,Cons(x,l)))))

Axiom MemRec :
(Forall 'a: type (Forall x: 'a (Forall y: 'a (Forall l: ['a] list
(Mem(x,l) -> Mem(x,Cons(y,l)))))))

Now, if memL is the ML boolean function testing the list membership, one 
can state the following lemma to prove the correctness of memL w.r.t. its logical 
specification given by the Mem relation with its two axioms: 

Theorem :

(Forall ’a: type (Forall x: ’a (Forall l: [’a] list
(`(memL x l) -> Mem(x,l)))))

Notice that in the above formula, the backquote character in the premise
`(memL x l) is a built-in predicate of sort bool -> Prop which maps ML
boolean values to logical truth values. We have two rules for this predicate:
(1) `(true) is always true; (2) under the hypothesis `(false), anything is true.

Equations. A second use of this “boolean promotion” is to state equations. For
instance, from the append definition, one can prove that:

Theorem append_assoc : 

(Forall ’a : type 

(Forall x: [’a] list (Forall y: [’a] list (Forall z: [’a] list 
`((append (append x y) z) = (append x (append y z)))))))

Of course, equations may be guarded as in: 

Theorem mem_not_eq: 

(Forall ’a:type (Forall x:’a (Forall y:’a (Forall l:[’a] list 
Not(Mem(x,l)) -> `((mem x Cons(y,l)) = (x=y))))))

This is why equality is not necessarily required at the logical level of our frame- 
work. Indeed, for ML program verification, one mainly wants to set equations 
between ML values and this is obtained by combining the ML built-in equality 
function with the backquote predicate. 
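
To make the role of boolean equations concrete, the following OCaml sketch evaluates the equation stated by append_assoc on given arguments. The definition of append is an assumption (the paper refers to it but does not show it in this section), and the helper name append_assoc_holds is hypothetical. Evaluating the equation on sample values is of course only a test, not a proof.

```ocaml
type 'a lst = Nil | Cons of 'a * 'a lst

(* Assumed definition of append: structural recursion on the first list. *)
let rec append l1 l2 =
  match l1 with
  | Nil -> l2
  | Cons (x, xs) -> Cons (x, append xs l2)

(* The boolean value that the backquote predicate would promote to a
   proposition, computed with OCaml's built-in structural equality. *)
let append_assoc_holds x y z =
  append (append x y) z = append x (append y z)
```

For instance, append_assoc_holds (Cons (1, Nil)) (Cons (2, Nil)) (Cons (3, Nil)) evaluates the two sides of the equation and compares them with the ML equality, which is what the combination of the backquote and = expresses at the logical level.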

A last example. Let us end the presentation of the specification language with 
an example about the toList function defined in section 1. The aim is to express 
the correctness of this function by something like: x belongs to the computed list 
(toList t) iff it belongs to t. 

Let us first define membership in trees: 

let rec memT x t = 
match t with 
Leaf (y) -> (x = y) 

| Node(t1,t2) -> (or (memT x t1) (memT x t2))

Total memT 

We may now state that values in the tree will be in the list obtained by applying
toList.

Theorem memL_memT : 

(Forall ’a : type (Forall x : ’a (Forall t : [’a] tree 

`( (memL x (toList t)) = (memT x t) ) )))
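
The statement memL_memT can be exercised on concrete values with the OCaml sketch below. The definitions of toList, append and memL are assumptions: toList is defined in a part of the paper not reproduced here, and memL is reconstructed from the way its definition unfolds in Example 3 of Section 4. The helper memL_memT_holds is a hypothetical name.

```ocaml
type 'a lst = Nil | Cons of 'a * 'a lst
type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree

let rec append l1 l2 =
  match l1 with
  | Nil -> l2
  | Cons (x, xs) -> Cons (x, append xs l2)

(* Assumed definition: collect the leaves from left to right. *)
let rec toList t =
  match t with
  | Leaf x -> Cons (x, Nil)
  | Node (t1, t2) -> append (toList t1) (toList t2)

(* Membership in lists, as a boolean function. *)
let rec memL x l =
  match l with
  | Nil -> false
  | Cons (y, ys) -> x = y || memL x ys

(* Membership in trees, following the definition of memT in the text. *)
let rec memT x t =
  match t with
  | Leaf y -> x = y
  | Node (t1, t2) -> memT x t1 || memT x t2

(* Both sides of the equation in memL_memT, compared as booleans. *)
let memL_memT_holds x t = memL x (toList t) = memT x t
```

Because the theorem is an equation between booleans, it covers both directions of the "iff" at once: the check succeeds for elements that occur in the tree and for elements that do not.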



4 Proofs 

The interaction with the proof assistant is rather standard: the proof engine
presents a sequent (a proof goal) to the user, who answers by typing in a proof
command together with its arguments (a tactic). The proof engine then applies
the given tactic to the current goal and again asks the user to prove any
resulting subgoals, until none remain. With the command line interface, the proof
is seen as a script, as in most proof assistants, but in our graphical interface the
tree structure is kept, allowing the user to prove the goals in any order and to
leave holes in the proof, to be filled later on.

The current implementation of PAF! does not provide many sophisticated
tactics, but the user may already find: (1) tactics allowing purely logical
reasoning, including most of the natural deduction rules and the left rules of the
sequent calculus (to ease reasoning on hypotheses); (2) tactics allowing the use of
hypotheses, axioms, theorems and equations. Note that, in the interactive process
of building proofs, a theorem can be used before its proof has been completed;
(3) tactics for opening function definitions, using the reduction relation and
strong typing, since we are concerned with programming features; (4) higher-level
tactics with more or less automation for structural induction, since we are
concerned with programs over inductive datatypes.

These tactics are built on top of a set of primitive rules based on Free
Deduction (see [17]), to which we add dedicated rules to handle ML term reduction
and strong typing. One of the most useful rules allows us to substitute a term
u for a term t in a formula whenever t reduces to u.



Example 1. We first prove the associativity of append using the induction tactic. 
We assume that append has been proved total. In answer to the command 

Theorem assoc_append :

(Forall ’a : type

(Forall l1: [’a] list (Forall l2: [’a] list (Forall l3: [’a] list
`((append (append l1 l2) l3) = (append l1 (append l2 l3)))))))

the proof engine prompts the user with the corresponding sequent:

|-

(Forall ’a: type

(Forall l1: [’a] list (Forall l2: [’a] list (Forall l3: [’a] list
`((append (append l1 l2) l3) = (append l1 (append l2 l3)))))))

Then the user simply enters: 

Byinduction 

and the proof engine answers that everything is Proved. 

One should not be overly enthusiastic about the level of automation of
the Byinduction tactic. It succeeds here because the proof of associativity of
append involves only one induction, one function, constructors and very simple
rewriting (or reduction). Note that, when given no parameter, Byinduction
tries induction on the first universally bound term variable in the formula.

When it fails to solve its goal automatically, the Byinduction tactic
hands control back to the user with as many subgoals as required by the
structural induction.

Example 2. Let us prove the totality of toList, met in Section 1. The goal is:

|-

(Forall ’a:type (Forall t:[’a] tree ((toList t):[’a] list))) 

Using Byinduction on this goal will produce the two subgoals corresponding to
the two structural induction cases for our binary trees.

The first subgoal, corresponding to the Leaf constructor, is

|-

(Forall xl5:’a ((toList Leaf(xl5)):[’a] list))

It is solved by our simple Auto tactic. 

The second subgoal, corresponding to the Node constructor, is

|-

(Forall xl6:[’a] tree (Forall xl7:[’a] tree

(((toList xl6):[’a] list) -> (((toList xl7):[’a] list)

-> ((toList Node(xl6,xl7)):[’a] list)))))

To solve it, some preliminary work is needed. Let us give the sequence of tactics 
without displaying the intermediate subgoals: 

Intron 3 
Generalize xl7 
Generalize xl6 

This leads us to the subgoal:

[H3 : ((toList xl6):[’a] list)]

H2 : (xl7:[’a] tree)

H1 : (xl6:[’a] tree)

|-

(Forall xl6:[’a] tree (Forall xl7:[’a] tree
(((toList xl7):[’a] list)

-> ((toList Node(xl6,xl7)):[’a] list))))

The first tactic call (Intron 3) introduces three premisses of the formula into the
hypothesis part of the sequent (above the sign |-). It corresponds to the
introduction rules of natural deduction. The latter two correspond to the
elimination rule for the universal quantifier; we use them to obtain the right
induction formula. Note that the strong typing proof obligations arising from the
elimination of bounded quantifiers have been solved automatically. The proof is
completed by one more use of the Byinduction tactic.



Example 3. The last example illustrates what remains to be done in order to im- 
prove automation on really trivial proofs. We prove totality of the memL function, 
met in Section 3. The goal to prove is:

(Forall ’a: type

(Forall x:’a (Forall l:[’a] list ((memL x l):bool))))

We proceed by induction on l:

Byinduction l

The Nil case is solved by Auto, thus we omit it. The second subgoal is: 

|-

(Forall xl5:’a (Forall xl6:[’a] list 
((Forall x:’a ((memL x xl6):bool)) 

-> (Forall x:’a ((memL x Cons(xl5,xl6)) :bool))))) 

Note that, since we have given an argument to the Byinduction tactic, it
builds the induction formula keeping the Forall x: ’a binding. Here the tactic
fails to prove the subgoal automatically. We thus need to proceed by hand.
First, we introduce the hypotheses with the tactic:

Intros 

and get 

[H6 : (x:’a)]

H5 : (Forall x:’a ((memL x xl6):bool))

H4 : (xl6:[’a] list)

H3 : (xl5:’a)

|-

((memL x Cons(xl5,xl6)):bool)

Let us open the definition of memL and evaluate the resulting term.

OpenAndEval memL

Our goal is now

[H6 : (x:’a)]

H5 : (Forall x:’a ((memL x xl6):bool))

H4 : (xl6:[’a] list)

H3 : (xl5:’a)

|-

((or (x=xl5) (memL x xl6)):bool)

We now have to use the fact that the boolean function or is total and to apply 
the theorem or_total, previously generated by Total or: 

Apply or_total 

Then we get as a last subgoal:

[H6 : (x:’a)]

H5 : (Forall x:’a ((memL x xl6):bool))

H4 : (xl6:[’a] list)

H3 : (xl5:’a)

|-

((memL x xl6):bool)

This subgoal is solved by Auto (using the induction hypothesis). The strong typing
statement (x=xl5):bool has been solved by Apply and is not even displayed to
the user, which illustrates what was said in Section 2.



5 Interfaces 

For the moment, two interfaces are usable with PAF!. The first one (of lower level) 
is a command line interface, which is in fact an Objective Caml toplevel extended 
with functions which allow us to insert new vernacular and proof commands, as 
well as simple functions for managing the proof. 

The second interface (still under heavy development), which unleashes the
full dynamic power of our proof engine, is a graphical user interface written
in Python/Tkinter by Yves Legrandgerard. This interface allows us to insert
vernacular commands (let, type, definitions, declarations, axioms or theorems)
at any point in the session, thus allowing full-page editing. Proofs may also be
edited full page, the interface being aware of their tree-like nature. The user may
ignore a proof, build only part of it (leaving some holes in it), or delete parts of
it. Goals may be proved in whatever order suits the user.

This interface is not a simple emulation of full-page editing through the use
of the command line together with undo, but communicates with the proof engine
through a dedicated Proof Engine Protocol [11,1] over TCP. This protocol is
synchronous: the interface issues a request, then waits for the answer of the proof
engine. On the other hand, the state of the interface is distinct from the state
of the proof engine. Requests are only sent when the user asks for it, and only
validated elements (vernacular or proof steps) are known to the proof engine.
For example, when the user invalidates a particular element, a request is issued
to the proof engine, which removes it, if allowed. But the interface keeps this
element in an invalidated state until the user either deletes it, or modifies and
then revalidates it.

Figure 1 shows an example where the user forgot to define a function and
had to go back to the beginning to insert it. Some commands have already
been issued and accepted by the prover, and proofs have been set aside (and
folded) by the user. The colors of the boxes and of the thin vertical lines change
according to the status of the box: validated, invalidated or false (the reader will
have to take our word for it, since some information gets lost in print).

Fig. 1. Inserting a new definition above



6 Implementation 

We have already mentioned some features of our system: dynamic capabilities, the
ability to be extended and the satisfaction of De Bruijn’s criterion. All these
features required an innovative architecture for the proof engine. We give
here hints on how this was achieved. More details may be found in [1].

6.1 Dynamic . . . 

In order to be able to insert or remove session elements (vernacular commands)
or proof elements in a dynamic fashion (as seen in Section 5), we cannot use a
simple architecture with one context maintaining the state after the last issued
command. We need to use distributed contexts: a context is embedded
into each session element and into each proof element, and is linked back to the
previous context following the lexical scope of the session. A session element
knows everything that has been declared in the session elements above it in
the session; similarly, a proof element knows everything that is in the upper
part of the proof, and everything that comes before the theorem it proves.
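
The idea of linked contexts can be sketched in a few lines of OCaml. This is only an illustration under strong simplifying assumptions, not the actual PAF! data structure: each element declares a single name, the record type element and the functions knows and insert_above are hypothetical, and the dependency checks are left out entirely.

```ocaml
(* Each session element carries its own piece of context and a link to
   the element directly above it in the session. *)
type element = {
  name : string;                  (* the symbol this element declares *)
  mutable prev : element option;  (* link to the previous context *)
}

(* An element knows a symbol if it declares it itself or if some element
   higher in the session does (lexical scope). *)
let rec knows elt sym =
  elt.name = sym
  || (match elt.prev with
      | None -> false
      | Some p -> knows p sym)

(* Insert a fresh element directly above [below] by relinking contexts. *)
let insert_above below fresh =
  fresh.prev <- below.prev;
  below.prev <- Some fresh
```

With this representation, inserting a forgotten definition above an existing theorem changes two links and leaves the elements higher in the session untouched.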

Inserting and deleting is then just a matter of linking and unlinking contexts,
while checking and updating a dependency tree.

6.2 . . . Easy to Extend . . .

The future of a proof assistant lies in its capacity to evolve. We wanted ours to
be as easy to extend as possible. Of course, writing a heavy automatic decision
procedure will always be tricky, but simple (compositional) tactics have to be
easy to add, and the cost of writing complex tactics must not be due to the
complexity of the system. To achieve this, we made good use of the
object-oriented features of Objective Caml.

Inheritance provides an easy way to force all tactics to follow a given
template. In our system, tactics have to inherit from the abstract class proofTac;
hence they have to implement two methods: generate and refineFun (kept abstract
in proofTac). The generate method is responsible for checking whether the tactic
applies and for constructing its premisses, while refineFun is used to
enforce reliability (see Section 6.3).

A tactic may be written either by calling other, more primitive tactics,
including the basic rules, or from scratch, through direct manipulation of the
sequent and of the formulas through an API. Inheritance allows us to design
families of tactics: e.g. all tactics inheriting from simplTac try to solve all subgoals
before submitting them to the user, without requiring further modification of the
tactic.

Composition of tactics allows us to write new ones in a progressive way. 
For example Intro, which does one introduction step, is written using direct 
manipulations on the sequent (but could have been written using left rules of 
free deduction), Intros simply iterates Intro until failure, and Auto tries to 
apply Intros, among other high level tactics. 
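
The following OCaml sketch shows how such a class hierarchy and the composition of Intro into Intros might look. Only the names proofTac, generate, Intro and Intros come from the text; the formula and sequent types, the exception Tactic_failure and the method signature are assumptions made for the example, and the refineFun method is omitted here.

```ocaml
(* Hypothetical goal representation: hypotheses and a conclusion. *)
type formula = Atom of string | Imp of formula * formula
type sequent = { hyps : formula list; concl : formula }

exception Tactic_failure

(* The template every tactic has to follow. *)
class virtual proofTac = object
  (* Check that the tactic applies and compute its premisses (subgoals). *)
  method virtual generate : sequent -> sequent list
end

(* One introduction step: move the left-hand side of an implication
   into the hypotheses. *)
class intro = object
  inherit proofTac
  method generate s =
    match s.concl with
    | Imp (a, b) -> [ { hyps = a :: s.hyps; concl = b } ]
    | Atom _ -> raise Tactic_failure
end

(* Intros is obtained by composition: it iterates Intro until failure. *)
class intros = object
  inherit proofTac
  method generate s =
    let step = new intro in
    let rec loop g =
      match step#generate g with
      | [ g' ] -> loop g'
      | _ -> [ g ]
      | exception Tactic_failure -> [ g ]
    in
    loop s
end
```

On the goal p -> (q -> p), the intro object performs a single step, while intros ends with hypotheses q and p and conclusion p. A new tactic only has to provide generate to fit into the same scheme.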

Besides these tactics, one finds a set of primitive rules which are not intended
to be user friendly, but rather to provide basic “bricks” to build tactics.

On top of this, there is a mechanism for dynamically loading tactics, which
may be compiled (or even downloaded!) and used in PAF! without restarting it.

6.3 . . . and Reliable 

Although it is nice to be able to write new tactics, it is even nicer if they do
not jeopardise the reliability of the system: one may write a tactic that always
“proves” its goal (GivenToTheStudentAsAnExercise), but do we want this to be
possible? The solution we adopted is to build afterwards an atomic proof, which
contains only primitive rules, out of the proof actually built by the user.

As in Milner’s LCF [7], tactics are written in two parts, one to build the
subgoals (generate) and one to ensure safety (refineFun). refineFun takes as
an argument the primitive proof corresponding to each premise of the tactic,
and has to return a primitive proof of its conclusion. Like generate, it
may be built either from scratch or using more primitive tactics to refine the
proof progressively. It may also reuse part of the work done by generate.
At the end of the process, we get the atomic proof, represented using an algebraic
datatype, which is proof-checked by a specific function.
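
A minimal OCaml sketch of this two-part scheme is given below. It is an assumption-laden toy, not the PAF! implementation: the logic has only atoms and implication, the atomic proof datatype has two rules, tactics are records rather than classes, and the names proof, check, intro and cheat are hypothetical. Only generate and refineFun are taken from the text.

```ocaml
type formula = Atom of string | Imp of formula * formula
type sequent = { hyps : formula list; concl : formula }

(* Atomic proofs: only primitive rules appear here. *)
type proof =
  | Axiom of sequent             (* the conclusion is among the hypotheses *)
  | ImpIntro of sequent * proof  (* from hyps, a |- b derive hyps |- a -> b *)

(* The independent checker: returns the proved sequent, or None if the
   atomic proof is not valid. *)
let rec check p =
  match p with
  | Axiom s -> if List.mem s.concl s.hyps then Some s else None
  | ImpIntro (s, sub) ->
    (match s.concl, check sub with
     | Imp (a, b), Some s' when s'.concl = b && s'.hyps = a :: s.hyps -> Some s
     | _ -> None)

(* A tactic in two parts. *)
type tactic = {
  generate : sequent -> sequent list option;  (* subgoals, None if not applicable *)
  refineFun : sequent -> proof list -> proof; (* atomic proof of the conclusion *)
}

let intro = {
  generate = (fun s ->
    match s.concl with
    | Imp (a, b) -> Some [ { hyps = a :: s.hyps; concl = b } ]
    | Atom _ -> None);
  refineFun = (fun s subproofs ->
    match subproofs with
    | [ p ] -> ImpIntro (s, p)
    | _ -> invalid_arg "intro expects exactly one subproof");
}

(* A deliberately unsound tactic that claims to solve any goal. *)
let cheat = {
  generate = (fun _ -> Some []);
  refineFun = (fun s _ -> Axiom s);
}
```

The cheat tactic can close a goal interactively, but the atomic proof it returns is rejected by check, so an unsound tactic is detected when the final proof is checked. The honest intro tactic yields an atomic proof that check accepts.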

This method is very convenient, because it allows us to use different
algorithms for building a proof and for checking it. The building part must be fast,
because it is interactive, so we may use decision procedures which are not able
to build the atomic proof. But because checking is done at the end, the user is
not bothered if building the atomic proof is not instantaneous.

7 Comparison with (a Few) Other Systems 

It is interesting to compare our system with others used in program verification. 

Our approach differs from Coq’s [18], even when Coq is used with the Program
tactic [15]. Coq uses program extraction in the Calculus of Constructions (which
allows only total functions); the program is therefore embedded into the proof,
while in our system it is written as it is. The use of Program or Why [6] surely
allows the user to write an annotated program and to verify it, but this is done
through encodings into the Calculus of Constructions, which contradicts our
WYSIWYP dogma. Another distinction is that we use genuine pattern matching
and not a case or a guarded fixpoint operator as in Coq.

Agda [5] is rather different from our system (except on the interface side,
Alfa providing full-page editing for Agda [8]). One might see Agda as a “super
ML with dependent types” where the user writes the program together with
the proof in Type Theory. In our system, proofs are distinct from programs. It
is legal to write non-terminating functions in Agda, and termination may be
checked using an external criterion.

We are close to PVS, in spite of some key differences. PVS proves properties 
on terms of an extended lambda-calculus, as we do, but we chose to stick to a 
well known programming language. To handle partial functions, PVS proposes 
subtyping. So all functions are total, but with much control over their domain 
through the use of a comprehension scheme over types. Our types are simpler, but 
we allow partial functions. Besides, PVS does not satisfy De Bruijn’s criterion. 

The system closest to ours is ACL2 [9]. The goals of this system are
the same as ours: to prove properties directly on programs, in a functional
programming language. The ACL2 authors chose LISP, while we chose ML. The big
difference between ACL2 and PAF! is the way proofs are built. When a theorem is
stated, ACL2 tries to prove it using a very efficient automatic decision procedure.
If this fails, the usual way is to state other lemmas first, until all the proofs
succeed. In PAF!, we wanted the proofs to be built interactively by the user. We
may add that ACL2 neither handles partial functions nor satisfies De Bruijn’s
criterion.

We are not far from algebraic specification methods: the user of the system 
defines an algebra using ML datatypes. He is then able to state the intended 
properties, using equations, to write the program, and finally to prove that it 
satisfies the intended properties. 



8 Conclusion 

PAF! is presently usable, but is not yet in a release state. In spite of this, and
from the experience we had using it, we have the feeling that our bet is on the
right track: What You See is actually What You Prove, the development of new
tactics is quite straightforward, and the user interface is promising.

Further work should include extensions of the system in order to handle more 
features of ML languages (exceptions, etc.). Automation has also to be worked 
out, as there are currently no complex decision procedures. 

Finally, we think that this work opens interesting perspectives in the field 
of program verification, and that our architecture has a future in the world of 
proof assistants. 

Acknowledgements. The author would like to thank Yves Legrandgerard,
for writing the graphical user interface of the system, and Chantal Berline, for
correcting most of the mistakes in this paper (remaining ones were added by the

Abstract. Higman’s lemma, a specific instance of Kruskal’s theorem, is
an interesting result from the area of combinatorics, which has often been 
used as a test case for theorem provers. We present a constructive proof of 
Higman’s lemma in the theorem prover Isabelle, based on a paper proof 
by Coquand and Fridlender. Making use of Isabelle’s newly-introduced 
infrastructure for program extraction, we show how a program can au- 
tomatically be extracted from this proof, and analyze its computational 
behaviour. 



1 Introduction 

Higman’s lemma [8] is an interesting problem from the field of combinatorics. 
It can be considered as a specific instance of Kruskal’s famous tree theorem, 
which is useful for proving the termination of term rewriting systems using
so-called simplification orders. Higman’s lemma states that every infinite sequence
of words (w_i)_{0≤i<ω} contains two words w_i and w_j with i < j such that w_i
can be embedded into w_j. A sequence with this property is also called good,
otherwise bad. Although a quite elegant classical proof of this statement has been 
given by Nash-Williams [12] using a so-called minimal bad sequence argument, 
there has been a growing interest in obtaining constructive proofs of Higman’s 
lemma recently. This is due to the additional informative content inherent in 
constructive proofs. For example, a termination proof of a string rewrite system 
based on a constructive proof of Higman’s lemma could be used to obtain upper 
bounds on the length of reduction sequences. 

The first formalization of Higman’s lemma using a theorem prover was done
by Murthy [10] in the Nuprl system [5]. Murthy first formalized Nash-Williams’
classical proof, then translated it into a constructive proof using a double
negation translation followed by Friedman’s A-translation, and finally extracted a
program from the resulting proof. Unfortunately, although correct in principle,
the program obtained in this way was so huge that it was both incomprehensible
and impossible to execute within a reasonable amount of time even on the fastest
computing equipment available. This rather disappointing experience prompted
several scientists to think about direct formalizations of constructive proofs of
Higman’s lemma, notably Murthy and Russell [11], as well as Fridlender [7], who
formalized Higman’s lemma using the ALF proof editor [9] based on Martin-Löf’s
type theory. Fridlender’s paper also gives a detailed account of the history of
Higman’s lemma. Murthy’s classical proof was also reconsidered by Herbelin, who
formalized an A-translated version of it in the Coq [2] system. Seisenberger’s
thesis [14,15] contains an excellent overview of various different formalizations
of Higman’s lemma, most of which have been carried out with the Minlog [3]
proof assistant.

A particularly elegant and short constructive proof, based entirely on
inductive definitions, has been suggested by Coquand and Fridlender [6]. The rest of
this paper is dedicated to a formalization of this proof in the theorem prover
Isabelle. To improve on previous formalizations, the central parts of the proof
are formulated using the Isar language for human-readable proofs due to Wenzel
[17]. Moreover, thanks to the design of Isabelle’s program extraction framework
[4], we are also able to derive a correctness statement for the extracted program
inside the logic. This is in contrast to most other implementations of program
extraction, whose correctness is often only justified by meta-theoretic arguments
on paper. Finally, to make the rather abstract exposition given by Coquand and
Fridlender more easily accessible, we also present an intuitive graphical
description of the computational behaviour of the extracted programs.

The rest of the paper is structured as follows: In §2, we give some basic 
definitions concerning sequences of words. §3 is concerned with assigning com- 
putational content to proofs involving inductive datatypes and predicates, which 
play a central role in the formalization. §4 describes the actual formalization in 
Isabelle, and §5 is devoted to an analysis of the program extracted from the 
proof. A conclusion is given in §6. 



2 Basic Definitions 

We start with a few basic definitions. Words are modelled as lists of letters from 
the two letter alphabet 1 

datatype letter = A | B
types word = letter list 

The empty list is denoted by [], and x # xs is infix notation for Cons x xs.
We use [x_1, ..., x_n] to abbreviate x_1 # ... # x_n # []. The embedding relation on
words is defined inductively as follows:

consts emb :: (word × word) set

inductive emb
intros

emb0: [] ⊴ bs
emb1: as ⊴ bs ==> as ⊴ b # bs
emb2: as ⊴ bs ==> a # as ⊴ a # bs

1 It is worth noting that the extension of the proof to an arbitrary finite alphabet is
not at all trivial. For details, see Seisenberger’s PhD thesis [15].

Intuitively, a word as can be embedded into a word bs if we can obtain as
by deleting letters from bs. For example, [A, A] ⊴ [B, A, B, A]. In order to
formalize the notion of a good sequence, it is useful to define the set L v of all
lists of words containing a word which can be embedded into v:

consts L :: word => word list set

inductive L v
intros

L0: w ⊴ v ==> w # ws ∈ L v
L1: ws ∈ L v ==> w # ws ∈ L v

A list of words is good if its tail is either good or contains a word which can 
be embedded into the word occurring at the head position of the list: 

consts good :: word list set

inductive good
intros

good0: ws ∈ L w ==> w # ws ∈ good
good1: ws ∈ good ==> w # ws ∈ good

In contrast to Coquand [6], who defines Cons such that it appends elements to
the right of the list, we use the usual definition of Cons, which appends elements
to the left. Therefore, the predicates on lists of words defined in this section,
such as the good predicate introduced above, work “in the opposite direction”,
e.g. [[A, A], [A, B], [B]] ∈ good, since [B] ⊴ [A, B]. In order to express the fact
that every infinite sequence is good, we define a predicate bar as follows:

consts bar :: word list set

inductive bar
intros

bar1: ws ∈ good ==> ws ∈ bar
bar2: (/\w. w # ws ∈ bar) ==> ws ∈ bar

Intuitively, ws ∈ bar means that either the list of words ws is already good,
or successively adding words will turn it into a good list. Consequently, [] ∈
bar means that every infinite sequence (w_i)_{0≤i<ω} must be good, i.e. have a
prefix w_0 ... w_n with [w_n, ..., w_0] ∈ good, since by successively adding words
w_0, w_1, ... to the empty list, we must eventually arrive at a list which is
good. Note that the above definition of bar is closely related to Brouwer’s more
general principle of bar induction [16, Chapter 4, §8]. Like the accessible part of
a relation, the definition of bar embodies a kind of well-foundedness principle.
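
The relations emb, L and good are defined inductively above; for finite inputs they can also be decided by simple recursive functions. The OCaml sketch below is such a decision procedure, written for illustration only. It is not part of the Isabelle formalization, and the function names emb, in_L and good are chosen here to mirror the predicates.

```ocaml
type letter = A | B
type word = letter list

(* emb v w: v can be obtained from w by deleting letters. Matching each
   letter of v against its first possible occurrence in w is sufficient. *)
let rec emb (v : word) (w : word) =
  match v, w with
  | [], _ -> true
  | _ :: _, [] -> false
  | a :: v', b :: w' -> if a = b then emb v' w' else emb v w'

(* in_L v ws: some word of ws can be embedded into v. *)
let in_L v ws = List.exists (fun w -> emb w v) ws

(* good ws: the tail contains a word embeddable into the head, or the
   tail is itself good. *)
let rec good ws =
  match ws with
  | [] -> false
  | w :: rest -> in_L w rest || good rest
```

On the examples from the text, emb [A; A] [B; A; B; A] and good [[A; A]; [A; B]; [B]] both evaluate to true, the latter because [B] embeds into [A; B].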



3 Computational Content of Inductive Datatypes and 
Predicates 

The main proof principles used in the proof of Higman’s lemma are induction on
datatypes and induction on the derivation of inductive predicates (or sets). In
order to extract a program from this proof, it is therefore important to investigate
which programs correspond to these proof principles.



3.1 Motivation 

For inductive datatypes, things are rather straightforward: A proof by induction
on a datatype gives rise to a program defined by recursion on the very same
datatype. The concept of an inductive predicate is quite similar to that of an
inductive datatype 2: While a datatype is characterized by a list of constructors
together with their types, an inductive predicate is characterized by a list of
introduction rules. Consequently, the program extracted from a proof by induction
on the derivation of an inductive predicate should be a recursive function, too,
where the recursion runs over a datatype which encodes the derivation. This
datatype can be derived from the introduction rules in a canonical way. Each
introduction rule φ corresponds to a constructor of type tyof φ, where tyof is a
type extraction function mapping a logical formula to the type of the program
extracted from its proof. For the fragment of Isabelle comprising implication (==>)
and universal quantification (/\), which is used to express the introduction rules,
tyof is defined by

tyof (/\x :: α. φ) = α => tyof φ
tyof (φ ==> ψ) = tyof φ => tyof ψ
tyof (P t) = α_P

where φ and ψ are computationally relevant formulae, and P is a computationally
relevant predicate variable. Every such predicate variable P is uniquely
associated with a type variable α_P. For a full definition of tyof, the
interested reader is referred to [4]. For an inductive predicate such as bar, we set
tyof (ws ∈ bar) = barT, where barT is a new datatype to be defined inductively.
The correspondence between proof rules for inductive predicates and programs,
which we have sketched above, is illustrated in Fig. 1. Intuitively, the datatype
barT representing the computational content of ws ∈ bar is an infinitely branching
tree from which one can read off words that, when appended to the sequence
of words ws, turn it into a good sequence. The branches of this tree are labelled
with words. For each appended word w, one moves one step closer to the leaves
of the tree, following the branch labelled with w. When a leaf, i.e. the constructor
bar1, is reached, the resulting sequence of words must be good. An example of
such a tree is shown in Fig. 2.
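
To see how such a tree is read, consider the following OCaml sketch. The datatype mirrors barT from the text; the function follow and the example tree toy are hypothetical and only serve as an illustration. The toy tree relies on the fact that the empty word embeds into every word, so starting from the list [[]], any appended word w yields the good list [w, []].

```ocaml
type letter = A | B
type word = letter list

(* OCaml counterpart of the datatype barT. *)
type barT =
  | Bar1 of word list
  | Bar2 of word list * (word -> barT)

(* Follow the branches labelled by the given words. Returns the list
   stored at the Bar1 leaf together with the number of words consumed,
   or None if the input runs out before a leaf is reached. *)
let rec follow t words n =
  match t, words with
  | Bar1 ws, _ -> Some (ws, n)
  | Bar2 _, [] -> None
  | Bar2 (_, f), w :: rest -> follow (f w) rest (n + 1)

(* A small tree for ws = [[]]: one more word always suffices. *)
let toy = Bar2 ([ [] ], (fun w -> Bar1 [ w; [] ]))
```

Here follow toy [[A; B]; [B]] 0 stops after consuming a single word, at the leaf carrying the list [[A; B]; []]. The tree extracted from the proof of Higman’s lemma is of course far larger, since it has to start from the empty list.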

Proof: induction on datatype
  list-induct: P [] ==>
    (/\x xs. P xs ==> P (x # xs)) ==> P ys
Program: recursion on datatype
  list-rec: α list => α_P =>
    (α => α list => α_P => α_P) => α_P

Proof: inductive predicate, introduction rules
  inductive bar
    bar1: /\ws. ws ∈ good ==> ws ∈ bar
    bar2: /\ws. (/\w. w # ws ∈ bar) ==> ws ∈ bar
Program: inductive datatype, constructors
  datatype barT =
      bar1 (word list)
    | bar2 (word list) (word => barT)

Proof: induction on derivation
  bar-induct: vs ∈ bar ==>
    (/\ws. ws ∈ good ==> P ws) ==>
    (/\ws. (/\w. w # ws ∈ bar) ==>
      (/\w. P (w # ws)) ==> P ws) ==>
    P vs
Program: recursion on datatype
  barT-rec: barT =>
    (word list => α_P) =>
    (word list => (word => barT) =>
      (word => α_P) => α_P) =>
    α_P

Fig. 1. Computational content of inductive datatypes and predicates

[Figure: a tree built from bar2 nodes and a bar1 node, with branches labelled by words such as [A, B] and [A, A].]

Fig. 2. Computational content of bar

2 It should be noted that this insight is the essence of expressive type theories based
on inductive types such as the Calculus of Inductive Constructions [13], where these
two concepts actually coincide due to the identification of propositions and types.
However, this is not the case for Isabelle/HOL, which treats propositions and types
as different concepts.

In order to reason about the correctness of programs extracted from proofs
involving the bar predicate, we need to describe under what conditions an element
of barT properly represents a derivation of a formula ws ∈ bar. This connection
between datatypes (or programs) and formulae can be captured by the concept
of modified realizability due to Kleene and Kreisel. We write realizes p φ to mean
that “program p realizes formula φ” or, more intuitively, “p satisfies specification
φ”. For the ==>, /\-fragment of Isabelle, realizes is defined by the equations

realizes p (/\x. ip) = f\ x. realizes (pi) <p 

realizes p (if => ip) = f\q. realizes q if =$■ realizes ( p q) ip 

realizes p (P i) = P R p t 

where again ip and if are computationally relevant formulae, and P is a computa- 
tionally relevant predicate variable. Each predicate variable P with n arguments 
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is uniquely associated with a predicate variable P R with n + 1 arguments. For 
the inductive predicate bar, we set realizes b ( ws £ bar) = ( b , ws ) £ barR, where 
barR is a new inductive predicate characterized by the introduction rules 

ws € good =>■ ( barl ws, ws) £ barR 

(/\w. (f w, w # ws) £ barR) =$■ ( bar2 ws f, ws) £ barR 

which express that the constructors of the datatype barT realize the introduction
rules of the predicate bar in the above sense. As a consequence,
if (bar2 ws f₀, ws) ∈ barR and

f₀ w₀ = bar2 (w₀ # ws) f₁, f₁ w₁ = bar2 (w₁ # w₀ # ws) f₂, …,
fₙ₋₁ wₙ₋₁ = bar2 (wₙ₋₁ # ⋯ # w₀ # ws) fₙ,
fₙ wₙ = bar1 (wₙ # ⋯ # w₀ # ws)

then (wₙ # ⋯ # w₀ # ws) ∈ good. Note that this need not be the
shortest possible good sequence. The induction principle for the bar predicate,
which is shown in Fig. 1, is realized by the recursion combinator

barT-rec f g (bar1 list) = f list

barT-rec f g (bar2 list fun) = g list fun (λx. barT-rec f g (fun x))

The corresponding correctness theorem for this realizer is

(b, vs) ∈ barR ⟹
(⋀ws. ws ∈ good ⟹ P^R (f ws) ws) ⟹
(⋀ws x. (⋀w. (x w, w # ws) ∈ barR) ⟹
   (⋀xa. (⋀w. P^R (xa w) (w # ws)) ⟹ P^R (g ws x xa) ws)) ⟹
P^R (barT-rec f g b) vs

which is easily proved by induction on the derivation of barR.
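The recursion equations for barT-rec can be tried out on a small tree. The following Python sketch is only an illustration and not part of the formalization: the names `Bar1`, `Bar2`, `bar_rec`, `after_empty_word` and `leaf_along` are hypothetical, and Python is used merely as a convenient notation. The example tree says that a sequence whose most recent word is empty becomes good as soon as any further word is appended.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

Word = List[str]


@dataclass
class Bar1:
    """Leaf: the sequence ws (most recent word first) is already good."""
    ws: List[Word]


@dataclass
class Bar2:
    """Inner node: for every next word w, branch(w) is the subtree for w # ws."""
    ws: List[Word]
    branch: Callable[[Word], "BarT"]


BarT = Union[Bar1, Bar2]


def bar_rec(f, g, b):
    """barT-rec f g (bar1 list) = f list
    barT-rec f g (bar2 list fun) = g list fun (lambda x: barT-rec f g (fun x))"""
    if isinstance(b, Bar1):
        return f(b.ws)
    return g(b.ws, b.branch, lambda x: bar_rec(f, g, b.branch(x)))


def after_empty_word(ws):
    """Hypothetical example tree for the sequence [] # ws."""
    start = [[]] + ws
    return Bar2(start, lambda w: Bar1([w] + start))


def leaf_along(tree, next_word):
    """Follow the tree along the words next_word(0), next_word(1), ... to a leaf."""
    return bar_rec(lambda ws: ws,
                   lambda ws, branch, rec: rec(next_word(len(ws))),
                   tree)


tree = after_empty_word([["A"]])
result = leaf_along(tree, lambda i: ["B", "A"])
```

Here `result` is the list stored in the leaf that is reached after appending one word; the second argument of `bar_rec` decides which branch to follow, exactly as g does in the second equation above.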



3.2 General Scheme 

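We will now generalize what we have just explained by an example. Consider
the general definition of an inductive set (or predicate) $S$

inductive $S$
\[
\begin{array}{l}
I_1 :\ \bigwedge \bar{x}_1 :: \bar{\tau}_1.\ \varphi_1^1 \Longrightarrow \cdots \Longrightarrow \varphi_1^{r_1} \Longrightarrow \\
\qquad (\bigwedge \bar{z}_1^1 :: \bar{\sigma}_1^1.\ \psi_1^1 \Longrightarrow t_1^1 \in S) \Longrightarrow \cdots \Longrightarrow
(\bigwedge \bar{z}_1^{s_1} :: \bar{\sigma}_1^{s_1}.\ \psi_1^{s_1} \Longrightarrow t_1^{s_1} \in S) \Longrightarrow u_1 \in S \\
\qquad \vdots \\
I_n :\ \bigwedge \bar{x}_n :: \bar{\tau}_n.\ \varphi_n^1 \Longrightarrow \cdots \Longrightarrow \varphi_n^{r_n} \Longrightarrow \\
\qquad (\bigwedge \bar{z}_n^1 :: \bar{\sigma}_n^1.\ \psi_n^1 \Longrightarrow t_n^1 \in S) \Longrightarrow \cdots \Longrightarrow
(\bigwedge \bar{z}_n^{s_n} :: \bar{\sigma}_n^{s_n}.\ \psi_n^{s_n} \Longrightarrow t_n^{s_n} \in S) \Longrightarrow u_n \in S
\end{array}
\]

where the $\varphi_i^j$ are non-recursive premises (also called side conditions), i.e.
they do not contain $S$. The recursive premises have the form

\[
\bigwedge \bar{z}_i^j :: \bar{\sigma}_i^j.\ \psi_i^j \Longrightarrow t_i^j \in S
\]

where $S$ does not occur in $\psi_i^j$, i.e. the occurrence of the recursive set is only
strictly positive.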







S. Berghofer 



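Induction. The rule for induction on the derivation of $S$ has the form

\[
x \in S \Longrightarrow \mathcal{I}_1 \Longrightarrow \cdots \Longrightarrow \mathcal{I}_n \Longrightarrow P\,x
\]

where

\[
\begin{array}{l}
\mathcal{I}_i = \bigwedge \bar{x}_i.\ \varphi_i^1 \Longrightarrow \cdots \Longrightarrow \varphi_i^{r_i} \Longrightarrow \\
\qquad (\bigwedge \bar{z}_i^1.\ \psi_i^1 \Longrightarrow t_i^1 \in S) \Longrightarrow \cdots \Longrightarrow
(\bigwedge \bar{z}_i^{s_i}.\ \psi_i^{s_i} \Longrightarrow t_i^{s_i} \in S) \Longrightarrow \\
\qquad (\bigwedge \bar{z}_i^1.\ \psi_i^1 \Longrightarrow P\,t_i^1) \Longrightarrow \cdots \Longrightarrow
(\bigwedge \bar{z}_i^{s_i}.\ \psi_i^{s_i} \Longrightarrow P\,t_i^{s_i}) \Longrightarrow P\,u_i
\end{array}
\]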



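Computational content of derivations. The datatype $S^T$ representing the
computational content of the derivation of $S$ is defined by

\[
\begin{array}{lcl}
\textbf{datatype}\ S^T & = & I_1^T\ \bar{\tau}_1\ (\mathit{tyof}\ \varphi_1^1) \cdots (\mathit{tyof}\ \varphi_1^{r_1})\
(\bar{\sigma}_1^1 \Rightarrow \mathit{tyof}\ \psi_1^1 \Rightarrow S^T) \cdots (\bar{\sigma}_1^{s_1} \Rightarrow \mathit{tyof}\ \psi_1^{s_1} \Rightarrow S^T) \\
& \vdots & \\
& | & I_n^T\ \bar{\tau}_n\ (\mathit{tyof}\ \varphi_n^1) \cdots (\mathit{tyof}\ \varphi_n^{r_n})\
(\bar{\sigma}_n^1 \Rightarrow \mathit{tyof}\ \psi_n^1 \Rightarrow S^T) \cdots (\bar{\sigma}_n^{s_n} \Rightarrow \mathit{tyof}\ \psi_n^{s_n} \Rightarrow S^T)
\end{array}
\]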



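Realizability predicate. The predicate $S^R$, which establishes a connection between
elements of the datatype $S^T$ and propositions of the form $x \in S$, is defined
inductively by the introduction rules

\[
\begin{array}{l}
I_i^R :\ \bigwedge \bar{x}_i\ p_i^1 \ldots p_i^{r_i}\ f_i^1 \ldots f_i^{s_i}.\
(\mathit{realizes}\ p_i^1\ \varphi_i^1) \Longrightarrow \cdots \Longrightarrow (\mathit{realizes}\ p_i^{r_i}\ \varphi_i^{r_i}) \Longrightarrow \\
\qquad (\bigwedge \bar{z}_i^1\ q_i^1.\ \mathit{realizes}\ q_i^1\ \psi_i^1 \Longrightarrow (f_i^1\ \bar{z}_i^1\ q_i^1,\ t_i^1) \in S^R) \Longrightarrow \cdots \Longrightarrow \\
\qquad (\bigwedge \bar{z}_i^{s_i}\ q_i^{s_i}.\ \mathit{realizes}\ q_i^{s_i}\ \psi_i^{s_i} \Longrightarrow (f_i^{s_i}\ \bar{z}_i^{s_i}\ q_i^{s_i},\ t_i^{s_i}) \in S^R) \Longrightarrow \\
\qquad (I_i^T\ \bar{x}_i\ p_i^1 \ldots p_i^{r_i}\ f_i^1 \ldots f_i^{s_i},\ u_i) \in S^R
\end{array}
\]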



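Computational content of induction principle. The above rule for induction on
the derivation of the set $S$ is realized by the recursion combinator $S^T$-rec for the
datatype $S^T$, which is characterized by the equations

\[
\begin{array}{l}
S^T\text{-rec}\ g_1 \ldots g_n\ (I_i^T\ \bar{x}_i\ p_i^1 \ldots p_i^{r_i}\ f_i^1 \ldots f_i^{s_i}) =
g_i\ \bar{x}_i\ p_i^1 \ldots p_i^{r_i}\ f_i^1 \ldots f_i^{s_i} \\
\qquad (\lambda \bar{z}_i^1\ q_i^1.\ S^T\text{-rec}\ g_1 \ldots g_n\ (f_i^1\ \bar{z}_i^1\ q_i^1)) \cdots
(\lambda \bar{z}_i^{s_i}\ q_i^{s_i}.\ S^T\text{-rec}\ g_1 \ldots g_n\ (f_i^{s_i}\ \bar{z}_i^{s_i}\ q_i^{s_i}))
\end{array}
\]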



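The fact that this recursion combinator correctly realizes the principle of
induction on the derivation $d$ can be expressed by

\[
(d, x) \in S^R \Longrightarrow \mathcal{R}_1 \Longrightarrow \cdots \Longrightarrow \mathcal{R}_n \Longrightarrow P^R\ (S^T\text{-rec}\ g_1 \ldots g_n\ d)\ x
\]

where

\[
\begin{array}{l}
\mathcal{R}_i = \bigwedge \bar{x}_i\ p_i^1 \ldots p_i^{r_i}\ f_i^1 \ldots f_i^{s_i}.\
(\mathit{realizes}\ p_i^1\ \varphi_i^1) \Longrightarrow \cdots \Longrightarrow (\mathit{realizes}\ p_i^{r_i}\ \varphi_i^{r_i}) \Longrightarrow \\
\qquad (\bigwedge \bar{z}_i^1\ q_i^1.\ \mathit{realizes}\ q_i^1\ \psi_i^1 \Longrightarrow (f_i^1\ \bar{z}_i^1\ q_i^1,\ t_i^1) \in S^R) \Longrightarrow \cdots \Longrightarrow \\
\qquad (\bigwedge \bar{z}_i^{s_i}\ q_i^{s_i}.\ \mathit{realizes}\ q_i^{s_i}\ \psi_i^{s_i} \Longrightarrow (f_i^{s_i}\ \bar{z}_i^{s_i}\ q_i^{s_i},\ t_i^{s_i}) \in S^R) \Longrightarrow \\
\qquad (\bigwedge h_i^1 \ldots h_i^{s_i}.\
(\bigwedge \bar{z}_i^1\ q_i^1.\ \mathit{realizes}\ q_i^1\ \psi_i^1 \Longrightarrow P^R\ (h_i^1\ \bar{z}_i^1\ q_i^1)\ t_i^1) \Longrightarrow \cdots \Longrightarrow \\
\qquad\quad (\bigwedge \bar{z}_i^{s_i}\ q_i^{s_i}.\ \mathit{realizes}\ q_i^{s_i}\ \psi_i^{s_i} \Longrightarrow P^R\ (h_i^{s_i}\ \bar{z}_i^{s_i}\ q_i^{s_i})\ t_i^{s_i}) \Longrightarrow \\
\qquad\quad P^R\ (g_i\ \bar{x}_i\ p_i^1 \ldots p_i^{r_i}\ f_i^1 \ldots f_i^{s_i}\ h_i^1 \ldots h_i^{s_i})\ u_i)
\end{array}
\]

and $P^R$ is a new predicate variable uniquely associated with the predicate
variable $P$ in the above induction rule for $S$. This correctness statement for $S^T$-rec
can be proved by induction on the derivation of $(d, x) \in S^R$.

Fig. 3. Minimal bad sequence argument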



4 Formalizing Higman’s Lemma 

Before explaining the actual proof, we will briefly sketch the main idea of
Nash-Williams’ classical proof³, since Coquand’s proof can be considered as a
constructive version of it. In order to show that every infinite sequence is good, we
assume there is a bad sequence and use this to derive a contradiction. If there is
a bad sequence, we may also construct a bad sequence $(w_i)_{0 \le i < \omega}$ which is
minimal wrt. word length⁴. Since any infinite sequence containing the empty word
is necessarily good, each $w_i$ must have the form $a_i \# v_i$. We can find a strictly
monotone function $f$ and a letter $a \in \{A, B\}$ such that $a_{f(i)} = a$ for all $i$. Now
consider the sequence $(v_{f(i)})_{0 \le i < \omega}$. If this sequence were bad, we could construct
the sequence

\[
s = w_0 \ldots w_{f(0)-1}\ v_{f(0)}\ v_{f(1)} \ldots
\]

Because the length of $v_{f(0)}$ is smaller than the length of $w_{f(0)}$, and $(w_i)_{0 \le i < \omega}$ is
minimal, this sequence must be good. For this to be possible, there must be $i$
and $j$ with $i < f(0)$ and $w_i \trianglelefteq v_{f(j)}$, because both $(w_i)_{0 \le i < \omega}$ and
$(v_{f(i)})_{0 \le i < \omega}$ are bad. However, since $v_{f(j)} \trianglelefteq w_{f(j)}$, this implies that
$w_i \trianglelefteq w_{f(j)}$, which contradicts the assumption that $(w_i)_{0 \le i < \omega}$ is bad.
Hence $(v_{f(i)})_{0 \le i < \omega}$ must be good, which means that there are $i$ and $j$ with
$i < j$ and $v_{f(i)} \trianglelefteq v_{f(j)}$, which implies that
$a \# v_{f(i)} \trianglelefteq a \# v_{f(j)}$ and therefore $w_{f(i)} \trianglelefteq w_{f(j)}$, which again
contradicts the assumption that $(w_i)_{0 \le i < \omega}$ is bad.

To capture the idea underlying the construction of the sequence s shown
above, we introduce a relation T, where (vs, ws) ∈ T a means that vs is
obtained from ws by first copying the prefix of words starting with the letter b,
where a ≠ b, and then appending the tails of words starting with a. This
construction principle is illustrated in Fig. 3, where the shaded parts correspond
to the sequence s above. In order to define T, we also introduce an auxiliary
relation R, where (vs, ws) ∈ R a means that ws can be obtained from vs by
prefixing each word with the letter a. It should be noted that we could as well
have defined T a as a function which, given a list ws, yields a list vs. However,
we found the relational formulation more convenient to work with.

³ A more general version of this proof for Kruskal’s theorem can be found, e.g.,
in the textbook by Baader and Nipkow [1].

⁴ A sequence $(w_i)_{0 \le i < \omega}$ is smaller than a sequence $(v_i)_{0 \le i < \omega}$ wrt. word
length iff there is a $k$ such that $w_j = v_j$ for all $j < k$ and
$\mathit{length}(w_k) < \mathit{length}(v_k)$.

consts R :: letter ⇒ (word list × word list) set

inductive R a
intros
  R0: ([], []) ∈ R a
  R1: (vs, ws) ∈ R a ⟹ (w # vs, (a # w) # ws) ∈ R a

consts T :: letter ⇒ (word list × word list) set

inductive T a
intros
  T0: a ≠ b ⟹ (vs, ws) ∈ R b ⟹ (w # ws, (a # w) # ws) ∈ T a
  T1: (vs, ws) ∈ T a ⟹ (v # vs, (a # v) # ws) ∈ T a
  T2: a ≠ b ⟹ (vs, ws) ∈ T a ⟹ (vs, (b # w) # ws) ∈ T a
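The text remarks that T a could also have been defined as a function. The following Python sketch is one possible reading of the rules R0, R1 and T0 to T2 as functions; it is an illustration only, and the names `prefix_all` and `t_image` are hypothetical. It assumes a two-letter alphabet, represents words as lists of letters and sequences as lists with the most recent word first, and returns `None` when no rule applies.

```python
def prefix_all(a, vs):
    """Functional reading of R a: the ws with (vs, ws) in R a."""
    return [[a] + w for w in vs]


def in_range_of_r(rest, a):
    """True if rest could be related by R b for some single letter b != a.

    Assumes at least one letter other than a exists (needed when rest is empty).
    """
    if any(len(w) == 0 for w in rest):
        return False
    firsts = {w[0] for w in rest}
    return len(firsts) <= 1 and a not in firsts


def t_image(a, ws):
    """Functional reading of T a: the vs with (vs, ws) in T a, or None."""
    if not ws:
        return None
    head, rest = ws[0], ws[1:]
    if not head:
        return None                      # no rule covers the empty word
    if head[0] == a:
        if in_range_of_r(rest, a):
            return [head[1:]] + rest     # rule T0
        sub = t_image(a, rest)           # rule T1
        return None if sub is None else [head[1:]] + sub
    return t_image(a, rest)              # rule T2: skip a word starting with b


example = t_image("A", [["A", "B"], ["B", "A"], ["B"]])
```

In `example`, the two oldest words start with B and are copied unchanged, while the newest word starts with A and contributes only its tail.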

The proof of Higman’s lemma is divided into several parts, namely prop1,
prop2 and prop3. From the computational point of view, these theorems can
be thought of as functions transforming trees. Theorem prop1 states that each
sequence ending with the empty word satisfies predicate bar, since it can trivially
be extended to a good sequence by appending any word. This easily follows from
the introduction rules for bar:

theorem prop1: ([] # ws) ∈ bar by rules

The intuition behind prop2, which is shown in Fig. 4, is a bit harder to grasp.
Given two trees encoding proofs of xs ∈ bar and ys ∈ bar, we produce a new
tree encoding a proof of zs ∈ bar by interleaving the two input trees. In order to
demonstrate that zs ∈ bar, we need to show that, given a sequence of words, we
can detect if appending this sequence to zs yields a good sequence. This is done
by inspecting each word in the sequence to be appended. If the word has the
form a # w, we move one step ahead in the tree witnessing xs ∈ bar, whereas
we move one step ahead in the tree witnessing ys ∈ bar if it has the form b # w.
Whenever we reach a leaf in one of these trees, we can be sure that, due to the
additional constraints on xs, ys and zs, we have turned zs into a good sequence.
If the word to be appended is just the empty word [], we know by prop1 that any
following word will make the sequence good. The proof of prop2 is by double
induction on the derivation of xs ∈ bar and ys ∈ bar (yielding the induction
hypotheses I and I'), followed by a case analysis on the word w to be appended
to the sequence zs.

Theorem prop3 states that we can turn a proof of xs ∈ bar into a proof of
zs ∈ bar, where zs is the list obtained by prefixing each word in the (nonempty)
list xs with the letter a. The proof together with its corresponding tree is shown
in Fig. 5. Note that the subtrees of this tree (reachable via edges labelled with
words w) are interleavings of other trees formed using prop2. In order to prove
zs ∈ bar, we again consider all possible words w to be appended to zs. There are
essentially two different cases which may occur:

theorem prop2:
  assumes ab: a ≠ b and bar: xs ∈ bar
  shows ⋀ys zs. ys ∈ bar ⟹ (xs, zs) ∈ T a ⟹ (ys, zs) ∈ T b ⟹ zs ∈ bar
  using bar
proof induct
  fix xs zs assume xs ∈ good and (xs, zs) ∈ T a
  show zs ∈ bar by (rule bar1) (rule lemma3)
next
  fix xs ys assume I: ⋀w ys zs. ys ∈ bar ⟹ (w # xs, zs) ∈ T a ⟹
    (ys, zs) ∈ T b ⟹ zs ∈ bar
  assume ys ∈ bar thus ⋀zs. (xs, zs) ∈ T a ⟹ (ys, zs) ∈ T b ⟹ zs ∈ bar
  proof induct
    fix ys zs assume ys ∈ good and (ys, zs) ∈ T b
    show zs ∈ bar by (rule bar1) (rule lemma3)
  next
    fix ys zs assume I': ⋀w zs. (xs, zs) ∈ T a ⟹ (w # ys, zs) ∈ T b ⟹ zs ∈ bar
      and ys: ⋀w. w # ys ∈ bar and Ta: (xs, zs) ∈ T a and Tb: (ys, zs) ∈ T b
    show zs ∈ bar
    proof (rule bar2)
      fix w show w # zs ∈ bar
      proof (cases w)
        case Nil thus ?thesis by simp (rule prop1)
      next
        case (Cons c cs) from letter-eq-dec show ?thesis
        proof
          assume ca: c = a
          from ab have (a # cs) # zs ∈ bar by (rules intro: I ys Ta Tb)
          thus ?thesis by (simp add: Cons ca)
        next
          assume c ≠ a with ab have cb: c = b by (rule letter-neq)
          from ab have (b # cs) # zs ∈ bar by (rules intro: I' Ta Tb)
          thus ?thesis by (simp add: Cons cb)
        qed
      qed
    qed
  qed
qed

Fig. 4. Proposition 2

theorem prop3:
  assumes bar: xs ∈ bar
  shows ⋀zs. xs ≠ [] ⟹ (xs, zs) ∈ R a ⟹ zs ∈ bar using bar
proof induct
  fix xs zs assume xs ∈ good and (xs, zs) ∈ R a
  show zs ∈ bar by (rule bar1) (rule lemma2)
next
  fix xs zs
  assume I: ⋀w zs. w # xs ≠ [] ⟹ (w # xs, zs) ∈ R a ⟹ zs ∈ bar
    and xsb: ⋀w. w # xs ∈ bar and xsn: xs ≠ [] and R: (xs, zs) ∈ R a
  show zs ∈ bar
  proof (rule bar2)
    fix w
    show w # zs ∈ bar
    proof (induct w)
      case Nil
      show ?case by (rule prop1)
    next
      case (Cons c cs)
      from letter-eq-dec show ?case
      proof
        assume c = a
        thus ?thesis by (rules intro: I [simplified] R)
      next
        from R xsn have T: (xs, zs) ∈ T a by (rule lemma4)
        assume c ≠ a
        thus ?thesis by (rules intro: prop2 Cons xsb xsn R T)
      qed
    qed
  qed
qed

Fig. 5. Proposition 3

theorem higman: [] ∈ bar
proof (rule bar2)
  fix w
  show [w] ∈ bar
  proof (induct w)
    show [[]] ∈ bar by (rule prop1)
  next
    fix c cs assume [cs] ∈ bar
    thus [c # cs] ∈ bar
      by (rule prop3) (simp, rules)
  qed
qed

Fig. 6. Main theorem



1. If w consists only of b’s, i.e. w = b^n for 0 < n, appending words of the form
   b^n.. or b^m with m < n to the sequence w # zs will lead to a good sequence
   due to prop1, whereas appending words of the form b^m a.. with m < n will
   lead to a good sequence due to the fact that xs ∈ bar. The subtrees named
   bar in Fig. 5 correspond to witnesses of this fact.

2. Similarly, if w contains the letter a, i.e. w = b^n a.. with 0 < n, appending
   words of the form b^n.. to the sequence w # zs can be shown to lead to a
   good sequence by appealing to the induction hypothesis. Computationally,
   this corresponds to a recursive call in the function producing the tree, which
   is why the corresponding subtrees in Fig. 5 are named prop3. Appending
   words of the form b^m or b^m a.. with m < n can be shown to lead to a good
   sequence by exactly the same argument as in case 1.

The proof of prop3 is by induction on the derivation of xs ∈ bar, followed by an
induction on the word w combined with a case analysis on letters.

We can now put together the pieces and prove the main theorem. In order
to prove that [] ∈ bar, it suffices to show that [w] ∈ bar for any word w. This
can be proved by induction on w. If w is empty, the claim follows by prop1.
Otherwise, if w = c # cs, we have [cs] ∈ bar by induction hypothesis, which
we can turn into a proof of [c # cs] ∈ bar using prop3. It should be noted that
structural induction on lists can be viewed as the constructive counterpart of
the minimality argument used in Nash-Williams’ classical proof.

The proof, together with a diagram illustrating the intuition behind it, is
shown in Fig. 6. The shaded parts of the drawing correspond to sequences for
which we already know that they are good due to the induction hypothesis [cs]
∈ bar. Processing the word w₁ in Fig. 6 corresponds to following the branch
labelled with bba.. in Fig. 5. Processing the word w₂ in Fig. 6, which starts with
at least as many b’s as the preceding word w₁, corresponds to a step in the
part of the rightmost subtree in Fig. 5, which was produced by a recursive call
to prop3. In contrast, processing the word w₃, which starts with fewer b’s than
w₁, corresponds to a step in the part of the rightmost subtree labelled with bar.

5 Analyzing the Computational Content

The computational content of the theorem shown in Fig. 6 is an infinitely
branching tree, which is a bit difficult to inspect. Using this theorem, we therefore
prove an additional statement yielding a program that, given an infinite sequence of
words, returns a finite prefix of this sequence which is good. Infinite sequences
are encoded as functions of type nat ⇒ 'a. The fact that a list is a prefix of an
infinite sequence can be characterized recursively as follows:

consts is-prefix :: 'a list ⇒ (nat ⇒ 'a) ⇒ bool

primrec
  is-prefix [] f = True
  is-prefix (x # xs) f = (x = f (length xs) ∧ is-prefix xs f)
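The recursion above is easy to misread because lists grow at the front: the head of the list is the most recent element, so it is compared with f at position length xs. A direct Python transcription, given purely as an illustration (the name `is_prefix` and the sample sequence `f` are hypothetical), makes this explicit.

```python
def is_prefix(ws, f):
    """ws lists the first len(ws) elements of the sequence f, newest first."""
    if not ws:
        return True
    x, xs = ws[0], ws[1:]
    # The head x is the element at index len(xs) of the sequence.
    return x == f(len(xs)) and is_prefix(xs, f)


def f(i):
    """Sample sequence of words: [A, A], [B], [A, B], then empty words."""
    start = [["A", "A"], ["B"], ["A", "B"]]
    return start[i] if i < len(start) else []


newest_first = is_prefix([["B"], ["A", "A"]], f)
oldest_first = is_prefix([["A", "A"], ["B"]], f)
```

Only the newest-first list counts as a prefix of `f`; the same two words in chronological order do not.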

We now prove that an infinite sequence f of words has a good prefix vs,
provided there is a prefix ws with ws ∈ bar. The proof is by induction on the
derivation of ws ∈ bar. If the derivation tree is a leaf, this means that the current
prefix is already good and we simply return it, otherwise we move ahead one
step in the tree and continue the search recursively, i.e. apply the induction
hypothesis.

theorem good-prefix-lemma:
  assumes bar: ws ∈ bar
  shows is-prefix ws f ⟹ ∃vs. is-prefix vs f ∧ vs ∈ good using bar
proof induct
  case bar1
  thus ?case by rules
next
  case (bar2 ws)
  have is-prefix (f (length ws) # ws) f by simp
  thus ?case by (rules intro: bar2)
qed

The fact that any infinite sequence has a good prefix can now be obtained
as a corollary of this theorem using higman:

theorem good-prefix: ∃vs. is-prefix vs f ∧ vs ∈ good
  using higman
  by (rule good-prefix-lemma) simp+

As has already been noted, the function extracted from theorem good-prefix
need not necessarily find the shortest good prefix. As an example, consider the
following three functions representing sequences of words:

  i        0         1      2            3     4    ...
  f₁(i)    [A, A]    [B]    [A, B]       []    []   ...
  f₂(i)    [A, A]    [B]    [B, A]       []    []   ...
  f₃(i)    [A, A]    [B]    [A, B, A]    []    []   ...

When applied to f₁, good-prefix returns the good prefix [[], [], [A, B], [B], [A, A]],
which is certainly not the shortest one. The reason for this should become clear
when looking at Fig. 6: In order for the algorithm to recognize that the word [B]
can be embedded into some subsequent word, this word has to start with at least
one B. However, since the following word starts with an A, the algorithm does
not recognize that [B] can be embedded into it. In contrast, when applied to f₂
and f₃, good-prefix returns the shortest good prefixes [[B, A], [B], [A, A]] and
[[A, B, A], [B], [A, A]], as expected. In the case of f₂, the algorithm recognizes
that [B] can be embedded into [B, A], since the latter starts with as many B’s as
the former. In the case of f₃, the algorithm recognizes that [A] can be embedded
into [B, A], and hence, due to lemma prop3, also recognizes that [A, A] can be
embedded into [A, B, A].

letter-eq-dec =
  λx xa.
    case x of A ⇒ case xa of A ⇒ Left | B ⇒ Right
    | B ⇒ case xa of A ⇒ Right | B ⇒ Left

prop1 = λx. bar2 ([] # x) (λw. bar1 (w # [] # x))

prop2 =
  λx xa xb xc H Ha.
    barT-rec (λws x xa H. bar1 xa)
      (λws xb r xc xd H.
        barT-rec (λws x. bar1 x)
          (λws xb ra xc.
            bar2 xc
              (λw. case w of [] ⇒ prop1 xc
                 | a # list ⇒
                     case letter-eq-dec a x of
                       Left ⇒ r list ws ((x # list) # xc) (bar2 ws xb)
                     | Right ⇒ ra list ((xa # list) # xc)))
          H xd)
      H xb xc Ha

prop3 =
  λx xa H.
    barT-rec (λws. bar1)
      (λws x r xb.
        bar2 xb
          (list-rec (prop1 xb)
            (λa list H.
              case letter-eq-dec a xa of Left ⇒ r list ((xa # list) # xb)
              | Right ⇒ prop2 a xa ws ((a # list) # xb) H (bar2 ws x))))
      H x

higman = bar2 [] (list-rec (prop1 []) (λa list. prop3 [a # list] a))
good-prefix-lemma = λx. barT-rec (λws. ws) (λws xa r. r (x (length ws)))
good-prefix = λx. good-prefix-lemma x higman

Fig. 7. Program extracted from the proof of Higman’s lemma
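For comparison with the results reported for the three example sequences, the shortest good prefixes can be computed by brute force. The Python sketch below is not the extracted program and is not derived from the proof; it simply tests all pairs of words for embedding, taking "u can be embedded into v" to mean that u is a (scattered) subsequence of v. All names (`embeds`, `is_good`, `shortest_good_prefix`, `make_sequence`) are hypothetical.

```python
from itertools import count


def embeds(u, v):
    """True if the word u is a scattered subsequence of the word v."""
    remaining = iter(v)
    return all(letter in remaining for letter in u)


def is_good(words):
    """words in chronological order: some earlier word embeds into a later one."""
    return any(embeds(words[i], words[j])
               for i in range(len(words))
               for j in range(i + 1, len(words)))


def shortest_good_prefix(f):
    """Shortest good prefix of the sequence f, returned newest word first."""
    words = []
    for i in count():
        words.append(f(i))
        if is_good(words):
            return list(reversed(words))


def make_sequence(start):
    """Sequence that begins with the given words and continues with empty words."""
    return lambda i: start[i] if i < len(start) else []


f1 = make_sequence([["A", "A"], ["B"], ["A", "B"]])
f2 = make_sequence([["A", "A"], ["B"], ["B", "A"]])
f3 = make_sequence([["A", "A"], ["B"], ["A", "B", "A"]])

prefixes = [shortest_good_prefix(f) for f in (f1, f2, f3)]
```

For the first sequence the brute-force search stops after three words, whereas the extracted program is reported above to return a prefix of five words; for the other two sequences both agree.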

The Isabelle/HOL functions extracted from the proof of theorem good-prefix
are shown in Fig. 7. The corresponding ML code, together with auxiliary
functions, is given in Appendix A. The correctness theorem for good-prefix is

is-prefix (good-prefix f) f ∧ good-prefix f ∈ good

whereas for higman, it is simply (higman, []) ∈ barR. The correctness theorems
for prop2 and prop3 are

a ≠ b ⟹
(⋀x. (x, xs) ∈ barR ⟹
  (⋀xa. (xa, ys) ∈ barR ⟹
    (xs, zs) ∈ T a ⟹ (ys, zs) ∈ T b ⟹ (prop2 a b ys zs x xa, zs) ∈ barR))

and

⋀x. (x, xs) ∈ barR ⟹ xs ≠ [] ⟹ (xs, zs) ∈ R a ⟹ (prop3 zs a x, zs) ∈ barR

Note that of the inductive predicates defined in this section, only bar has a
computational content. If we were not just interested in a good prefix, but also
in the exact positions of the two words which can be embedded into each other,
we would also have to assign the predicate good a computational content.



6 Conclusion 

By formalizing Higman’s lemma, we have demonstrated that Isabelle’s program 
extraction module is capable of handling realistic examples. The formalization 
is rather compact and consists of only 280 lines of Isabelle definitions and proof 
scripts. The automatically extracted program turns out to be quite readable. 
Its ML version is about 70 lines in length (including auxiliary functions), and 
performs reasonably well on medium-size sequences. For example, a sequence of 
350 words with an average length of 20 letters can be processed in 1.44 seconds 
on a Pentium III with 1 GHz.

A ML Code Generated from Proof of Higman’s Lemma



datatype letter = A | B;

datatype nat = id0 | Suc of nat;

datatype barT = bar1 of letter list list
  | bar2 of letter list list * (letter list -> barT);

fun barT_rec f1 f2 (bar1 list) = f1 list
  | barT_rec f1 f2 (bar2 (list, funa)) = f2 list funa (fn x => barT_rec f1 f2 (funa x));

fun op_43_def0 id0 n = n
  | op_43_def0 (Suc m) n = op_43_def0 m (Suc n);

fun size_def3 [] = id0
  | size_def3 (a :: list) = op_43_def0 (size_def3 list) (Suc id0);

fun good_prefix_lemma x =
  (fn H => barT_rec (fn ws => ws) (fn ws => fn xa => fn r => r (x (size_def3 ws))) H);

fun list_rec f1 f2 [] = f1
  | list_rec f1 f2 (a :: list) = f2 a list (list_rec f1 f2 list);

datatype sumbool = Left | Right;

fun letter_eq_dec x =
  (fn xa =>
    (case x of A => (case xa of A => Left | B => Right)
      | B => (case xa of A => Right | B => Left)));

fun prop1 x = bar2 (([] :: x), (fn w => bar1 (w :: ([] :: x))));

fun prop2 x =
  (fn xa => fn xb => fn xc => fn H => fn Ha =>
    barT_rec (fn ws => fn x => fn xa => fn H => bar1 xa)
      (fn ws => fn xb => fn r => fn xc => fn xd => fn H =>
        barT_rec (fn ws => fn x => bar1 x)
          (fn ws => fn xb => fn ra => fn xc =>
            bar2 (xc, (fn w =>
              (case w of [] => prop1 xc
                | (xd :: xe) =>
                  (case letter_eq_dec xd x of
                    Left => r xe ws ((x :: xe) :: xc) (bar2 (ws, xb))
                  | Right => ra xe ((xa :: xe) :: xc))))))
          H xd)
      H xb xc Ha);

fun prop3 x =
  (fn xa => fn H =>
    barT_rec (fn ws => fn x => bar1 x)
      (fn ws => fn x => fn r => fn xb =>
        bar2 (xb, (fn w =>
          list_rec (prop1 xb)
            (fn a => fn list => fn H =>
              (case letter_eq_dec a xa of Left => r list ((xa :: list) :: xb)
                | Right => prop2 a xa ws ((a :: list) :: xb) H (bar2 (ws, x))))
            w)))
      H x);

val higman : barT =
  bar2 ([], (fn w =>
    list_rec (prop1 [])
      (fn a => fn list => fn H => prop3 ((a :: list) :: []) a H) w));
fun good_prefix x = good_prefix_lemma x higman;

Abstract. This work presents an object-oriented calculus based on higher-order
mixin construction via mixin composition, where some software engineering
requirements are modelled in a formal setting that allows us to prove the absence of
message-not-understood run-time errors. Mixin composition is shown to be a
valuable language feature enabling a cleaner object-oriented design and
development. In what we believe to be quite a general framework, we give directions for
designing a programming language equipped with higher-order mixins, although
our study is not based on any existing object-oriented language.



1 Introduction 

Mixins have recently been undergoing a renaissance (see, for example, [1,7,8]), owing
to their flexible nature as “incomplete” classes that can be completed according to the
programmer’s needs. Mixins [14,19] are (sub)class definitions parameterized over a
superclass and were introduced as an alternative to some forms of multiple inheritance
[13,22]. A mixin can be seen as a function that, given one class as an argument,
produces another class by adding or overriding certain sets of methods. The same mixin
can be used to produce a variety of classes with the same functionality and behavior,
since they all have the same sets of methods added and/or redefined. Also, the same
mixin can sometimes be applied to the same class more than once, thus enabling
incremental changes in the subclasses. The superclass definition is not needed at the
time of writing the mixin definition. This minimizes the dependencies between a
superclass and its subclasses, as well as between class implementors and end-users, thus
improving modularity. Such uniform extension and modification of classes is absent
from classical class-based languages. In this work we extend the core calculus of classes
and mixins of [10] with higher-order mixins. A mixin can: (i) be applied to a class to
create a fully-fledged subclass; or (and this is the novelty with respect to [10]) (ii) be
composed with another mixin to obtain yet another mixin with more functionalities. In
Section 2.1 we present some uses of mixin inheritance and, in particular, we show that
mixin composition enables a cleaner modular object-oriented design.

* This work has been partially supported by EU within the FET - Global Computing initiative,
project MIKADO IST-2001-32222 and project DART IST-2001-33477, and by MIUR projects
NAPOLI and PROTOCOLLO. The funding bodies are not responsible for any use that might
be made of the results presented here.

This paper presents a framework for the construction of composite mixins, and
therefore of sophisticated class hierarchies, while keeping the good features of the original
core calculus of [10]. In particular, we retain structural subtyping. As in most popular
object-oriented languages, objects in our calculus can only be created by instantiating
a class. We use structural subtyping to remove the dependency of object users on class
implementation. Each object has an object type, which lists the names and types of
methods and fields but does not include information about the class from which the object
was instantiated. Therefore, objects created from unrelated classes can be substituted
for each other if their types satisfy the subtyping relation. Structural subtyping was a
deliberate design decision already in [11,10,24], motivated by the desire to minimize
code dependencies between object users and class implementors. A different approach
would be to follow Java or C++, in which an object’s type is related to the class from
which it was instantiated, and subtyping relations apply only to objects instantiated from
the same class hierarchy (nominal subtyping). Subtyping is defined on object types only,
not on class and mixin types, to avoid the well-known inheritance-subtyping conflicts
(for an account of the subject, see for instance [15]). As a consequence of the absence
of subtyping on classes, a higher-order mixin is more than a function that consumes
and produces classes, since such a function cannot accept a class with extra methods
as an argument. Moreover, the type system would have to express that the result of the
“mixin-function” has at least the methods of the argument, and such general extensions
to the type system look unnecessarily complex for the model’s more specific purpose.

Our design decisions are strongly based on the choices that were made in [10]. Class
hierarchies in a well-designed object-oriented program must not be fragile: if a superclass
implementation changes but the specification remains intact, the implementors of the
subclasses should not have to rewrite subclass implementations. This is only possible
if object creation is modular. In particular, a subclass implementation should not be
responsible for initializing inherited fields when a new object is created, since some of
the inherited fields may be private and thus invisible to the subclass. Also, the definitions
of inherited fields may change when the class hierarchy changes, making the subclass
implementation invalid. Unlike many theoretical calculi for object-oriented languages,
our calculus directly supports modular object construction. The mixin implementor only
writes the local constructor for his own mixin. Mixin applications and compositions are
reduced to generator functions that call all constructors in the inheritance chain in the
correct order, producing a fully initialized object (see Section 3). Unlike some approaches
to encapsulation in object calculi such as existential types, the levels of encapsulation
describe visibility, and not merely accessibility. For example, even the names of private
items are invisible outside the class in which they are defined. This seems to be a better
approach since no information about data representation is revealed, not even the number
and names of fields. One of the benefits of using visibility-based encapsulation is that
no conflicts arise if both the superclass and the subclass declare a private field with the
same name. Among other advantages, this allows the same mixin to be applied twice
(see the example in Section 2.1). To ensure that mixin inheritance can be statically type
checked, the calculus employs constrained parameterization. From each mixin definition
the type system infers a constraint specifying to which classes the mixin can be applied
so that the resulting subclass is type-safe. The constraint includes both positive (which
methods the class must contain) and negative (which methods the class must not contain)
information. New and redefined methods are distinguished in the mixin implementation:
from the implementor’s viewpoint, a new method may have arbitrary behavior, while
the behavior of a redefined method must be “compatible” with that of the old method it
replaces. Having this distinction in the syntax of our calculus helps mixin implementors
avoid unintentional redefinitions of superclass methods and facilitates generation of the
constraint for a mixin’s superclasses and for mixins that participate in mixin composition
(see Section 4). A marginal difference with respect to the original mixin calculus [10]
is that we do not treat protected methods, since they are orthogonal to higher-order
mixins. Nevertheless, protected methods could be easily accounted for via (structural)
subtyping as in the original calculus.

e ::= const | x | λx.e | e_1 e_2 | fix | ref | ! | := | {x_i = e_i}^{i∈I} | e.x
    | H h.e | classval(v_g, ℳ) | new e
    | mixin
        method m_j = v_{m_j};     (j ∈ New)
        redefine m_k = v_{m_k};   (k ∈ Redef)
        expect m_i;               (i ∈ Expect)
        constructor v_c;
      end
    | mixinval(v_m, New, Redef, Expect)
    | e_1 ⋄ e_2 | e_1 • e_2

v ::= const | x | λx.e | fix | ref | ! | := | := v | {x_i = v_i}^{i∈I}
    | classval(v_g, ℳ)
    | mixinval(v_m, New, Redef, Expect)

Fig. 1. Syntax of the core calculus: expressions and values.

2 Syntax of the Calculus 

The starting point for our calculus is the core calculus of classes and mixins of Bono
et al. [10] that, in turn, is based on Reference ML of Wright and Felleisen [25]. To
this imperative calculus of records and functions, we add constructs for manipulating
classes and mixins. The class- and mixin-related expressions are: classval, mixin, mixinval,
⋄ (mixin application), • (mixin composition) and new. The novelties with respect to [10]
are mixinval and • (mixin composition), which deal with higher-order mixins.

Expressions and values are given in Figure 1. Most of them are standard; the only
constructs that might need some explanation are the following:

- ref, !, := are operators 1 for defining a reference to a value, for de-referencing a 
reference, and for assigning a new value to a reference, respectively. 

- {xi = e , }' el is a record and e.x is the record selection operation (note that this 
corresponds to method selection in our calculus). 

- h is a set of pairs h : : = {(:r,t>)*} where a: is a variable and v is a value (first 
components of the pairs are all distinct). We have a concept of a heap, represented 

1 Introducing ref, !, : = as operators rather than standard forms such as refe, !e, : =eie 2 , simplifies 
the definition of evaluation contexts and proofs of properties. As noted in [25], this is just a 
syntactic convenience, as is the curried version of : =. 
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by h in the expression H h.e, used for evaluating imperative side effects. In the 
expression . . . (x n ,v n ).e, H binds variables xi,...x n invi,...,v n and in 

e. 

- new e uses generator v g of the class value to which e evaluates to create a function 
that returns a new object, as described in Section 3. 

- classval(u g , M) is a class value, and it is the result of mixin application. It is a pair, 
containing the function v g , that is the generator for the class used to generate its 
instance objects, and the set A4 of the indices of all the methods defined in the class. 
In our calculus method names are of the shape m, , where i ranges over an index set, 
and are univocally identified by their index, i.e., m t = rrij if and only if i = j. 

- mixin
      method $m_j = v_{m_j}$;      ($j \in \mathit{New}$)
      redefine $m_k = v_{m_k}$;    ($k \in \mathit{Redef}$)
      expect $m_i$;                ($i \in \mathit{Expect}$)
      constructor $v_c$;
  end

is a mixin expression, and it states the methods that are new, redefined, and expected
in the mixin (names of which have to be all distinct). More precisely, $m_j = v_{m_j}$ are
definitions of the new methods, $m_k = v_{m_k}$ are method redefinitions that will replace
the methods with the same name in the superclass, and $m_i$ are method (names) that
the superclass is expected to implement. Each method body $v_{m_j}$ (respectively, $v_{m_k}$)
is a function of the private field and of self, which will be bound to the newly created
object at instantiation time. In method redefinitions, $v_{m_k}$ is also a function of next,
which will be bound to the corresponding old method from the superclass. The $v_c$
value in the constructor clause is a function that returns a record of two components:
the fieldinit value is used to initialize the private field; the superinit value is passed
as an argument to the superclass constructor. When evaluating a mixin, $v_c$ is used
to build the generator as described in Section 3.

- mixinval($v_m$, New, Redef, Expect) is a mixin value, and it is the result of a mixin
evaluation. It is a tuple containing one function and three sets of indices. The
function $v_m$ is the (partial) generator for the corresponding mixin. The sets New,
Redef, and Expect contain the names of all methods defined in the mixin (new,
redefined, and expected).

- $e_1 \circ e_2$ denotes the application of mixin value $e_1$ to class value $e_2$. Given the
(super)class value $e_2$ as an “argument” to $e_1$, it produces a new (sub)class value.

- $e_1 \bullet e_2$ is a composition of two mixin values $e_1$ and $e_2$. It produces a new mixin
value taking components from both $e_1$ and $e_2$. The resulting mixin can be applied
to class values to produce new classes, as well as composed with other mixin values
to produce new composite mixins.

As in [10], we define the root of the class hierarchy, class Object, as a predefined
class value: Object = classval($\lambda\_.\lambda\_.\{\}$, [ ]). The root class is necessary so that all
other classes can be treated uniformly and it is the only class value that is not obtained
as a result of mixin application. The calculus can then be simplified by assuming that
any user-defined class that does not need a superclass is obtained by applying a mixin
containing all of the class method definitions to Object. For the sake of clarity, in the
following examples we will avoid the explicit mixin application to Object.

2.1 An Example of Mixin Inheritance

In this section, we present a simple example that shows how mixins can be implemented 
and used in our calculus and explain some of the uses of mixin application and mixin 
composition. For readability, the example uses functions with multiple arguments even 
though they are not formalized explicitly in the calculus. 

In the following, we give the definitions of the Encrypted mixin and the Compress mixin,
which implement encryption and compression functionality on top of any stream class,
respectively. Note that the class to which the mixin is applied may have more methods
than expected by the mixin. For example, Encrypted can be applied to Socket ∘
Object, even though Socket ∘ Object has other methods besides read and write. The
mixin Random allows random access to any stream class, thus we can build a random
access file class with the mixin application Random ∘ FileStream.



let FileStream = mixin
    method write = . . .
    method read = . . .
end in

let Socket = mixin
    method write = . . .
    method read = . . .
    method IPaddress = . . .
end in

let Random = mixin
    method lseek = . . .
    expect write;
    expect read;
end in

let Encrypted =
mixin
    redefine write = λ key. λ self. λ next. λ data. next (encrypt(data, key));
    redefine read = λ key. λ self. λ next. λ _. decrypt(next (), key);
    constructor λ (key, arg). {fieldinit=key, superinit=arg};
end in

let Compress =
mixin
    redefine write = λ level. λ self. λ next. λ data. next (compress(data, level));
    redefine read = λ level. λ self. λ next. λ _. uncompress(next (), level);
    constructor λ (level, arg). {fieldinit=level, superinit=arg};
end in . . .



From the definition of Encrypted, the type system infers the types of the methods that 
the mixin wants to redefine. These are the constraints that must be satisfied by any class 
to which Encrypted is applied. The class must contain write and read methods whose 
types must be supertypes of those given to write and read, respectively, in the definition 
of Encrypted. In Random such methods are declared as expected and they are used 
within the method lseek. Once again the type system infers their types according to how 
they are used in lseek. 

To create an encrypted stream class, one must apply the Encrypted mixin to an
existing stream class. For example, Encrypted ∘ FileStream is an encrypted file class.
The power of mixins can be seen when we apply Encrypted to a family of different
streams. For example, we can construct Encrypted ∘ Socket, which is a class that
encrypts data communicated over a network. In addition to single inheritance, we can
express many uses of multiple inheritance by applying more than one mixin to a class.
For example, PGPSign ∘ UUEncode ∘ Encrypted ∘ Compress ∘ FileStream produces
a class of files that are compressed, then encrypted, then uuencoded, then signed. In
addition, mixins can be used for forms of inheritance that are not possible in most single
and multiple inheritance-based systems. In the above example, the result of applying
Encrypted to a stream satisfies the constraint required by Encrypted itself; therefore,
we can apply Encrypted more than once: Encrypted ∘ Encrypted ∘ FileStream is a
class of files that are encrypted twice. In our calculus, class private fields do not conflict
even if they have the same name, so each application of Encrypted can have its own
encryption key.
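The double application Encrypted ∘ Encrypted ∘ FileStream can be imitated in Python, purely as an illustration. The sketch below assumes that mixin application is modeled by a function from a class to a subclass, that `MemoryStream` stands in for FileStream, and that `xor` is a toy substitute for encrypt/decrypt; none of these names come from the calculus. Private fields are stored per generated class, so the two applications keep separate keys.

```python
def xor(data, key):
    """Toy reversible 'cipher' (an assumption): XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)

class MemoryStream:
    """Hypothetical stand-in for FileStream: a one-slot in-memory buffer."""
    def __init__(self, _arg=None):
        self._buf = b""
    def write(self, data):
        self._buf = data
    def read(self):
        return self._buf

def encrypted(base):
    """Models the mixin application Encrypted ∘ base."""
    class Encrypted(base):
        def __init__(self, args):
            key, arg = args                  # constructor λ(key, arg)
            super().__init__(arg)            # superinit = arg
            # fieldinit = key, stored per generated class so that
            # repeated applications of the mixin do not clash
            self.__dict__.setdefault("_private", {})[Encrypted] = key
        def write(self, data):
            # next (encrypt(data, key))
            super().write(xor(data, self._private[Encrypted]))
        def read(self):
            # decrypt(next (), key)
            return xor(super().read(), self._private[Encrypted])
    return Encrypted

# Encrypted ∘ Encrypted ∘ MemoryStream, with keys 3 (outer) and 5 (inner)
Twice = encrypted(encrypted(MemoryStream))
stream = Twice((3, (5, None)))
stream.write(b"hi")
```

Reading back through both layers returns the original bytes, while the underlying buffer holds the data transformed by both keys.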

Mixin composition further enhances the (re)usability of classes and mixins and 
enables better modular programming design, by exploiting software composition at 
a higher level. For example, the programmer is able to build a customized library of 
reusable mixins starting from existing mixins: one can create the new mixin 2Encrypt 
= Encrypted • Encrypted, instead of always applying the mixin Encrypted twice 
to every stream class in her program. This also enables consistency: if in the future the 
definition of the mixin 2Encrypt must be extended, e.g., by also exploiting UU encoding, 
then by changing only the definition of 2Encrypt, with an additional mixin composition, 
it is guaranteed that all the functions that used 2Encrypt will use the new version. 
Moreover, construction of mixins can be delegated to different parts of the program 
(thus exploiting modular programming), and the resulting mixins can then be assembled 
in order to build a class. For instance, the following code delegates the construction of 
mixins for encryption and compression to two functions, and then assembles the returned 
mixins for later use: 

let m1 = build_compression() in let m2 = build_encryption() in
let m = m1 • m2 in (new (m ∘ FileStream)).write("foo")

The function build_compression returns a specific mixin according to the user’s requests:
it can return a simple Compress mixin, or a more elaborate UUEncode • Compress
mixin. Similarly, build_encryption, instead of simply returning a mixin Encrypted,
returns the composition PGPSign • Encrypted. None of these modular composition
facilities, which mixin composition supports, would be directly provided by
simple mixin application.
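The delegation pattern above can be mimicked in Python as a rough sketch. It assumes mixins are modeled as functions from classes to subclasses, so that composition (•) becomes ordinary function composition; in the calculus, mixin values are not functions and composition has its own reduction rule. The mixins `upper` and `bracket`, the class `Sink`, and the bodies of the two builder functions are hypothetical placeholders.

```python
def compose(*mixins):
    """Models m1 • m2 • ...: the rightmost mixin is applied to the class first."""
    def composite(base):
        for mixin in reversed(mixins):
            base = mixin(base)
        return base
    return composite

class Sink:
    """Hypothetical stand-in for FileStream: records what is written."""
    def __init__(self):
        self.log = []
    def write(self, data):
        self.log.append(data)

def upper(base):
    class Upper(base):
        def write(self, data):
            super().write(data.upper())
    return Upper

def bracket(base):
    class Bracket(base):
        def write(self, data):
            super().write("[" + data + "]")
    return Bracket

def build_compression():
    return upper        # placeholder for a Compress-like mixin

def build_encryption():
    return bracket      # placeholder for an Encrypted-like mixin

m1 = build_compression()
m2 = build_encryption()
m = compose(m1, m2)     # m = m1 • m2
obj = m(Sink)()         # new (m ∘ Sink)
obj.write("foo")
```

As in the text, the left-hand mixin acts first on written data: the string is upper-cased by `m1` and then bracketed by `m2` before it reaches the underlying class.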

Finally, let us observe that streams are usually implemented via the decorator design
pattern [21] (for instance, in Java), and this requires additional manual programming.
With mixins (and in particular with mixin composition), streams can instead be
programmed directly by exploiting language features. This is just one example of the
additional expressiveness provided by mixin composition.

3 Operational Semantics 

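The operational semantics of the original calculus [10] is very close to an implementation,
and we follow the same approach. Our operational semantics is a set of rewriting
rules including the standard rules for a lambda calculus with stores (in our case the
Reference ML [25]), and some rules that evaluate the object-oriented related forms to
records and functions, following the “objects-as-records” technique and Cook’s
“class-as-generator-of-objects” principle. This operational semantics can also be seen as
something extremely close to a denotational description for objects, classes, and mixins, and
this “identification” of implementation and semantical denotation is, in our view, a
good by-product of our approach.

$$
\begin{array}{ll}
\mathit{const}\ v \;\to\; \delta(\mathit{const}, v) \quad \text{if } \delta(\mathit{const}, v) \text{ is defined} & (\delta) \\
(\lambda x.e)\ v \;\to\; [v/x]\,e & (\beta_v) \\
\mathsf{fix}\ (\lambda x.e) \;\to\; [\mathsf{fix}(\lambda x.e)/x]\,e & (\mathit{fix}) \\
\{\ldots, x = v, \ldots\}.x \;\to\; v & (\mathit{select}) \\
\mathsf{ref}\ v \;\to\; H(x, v).x & (\mathit{ref}) \\
H(x, v)h.R[!\,x] \;\to\; H(x, v)h.R[v] & (\mathit{deref}) \\
H(x, v)h.R[{:=}\,x\,v'] \;\to\; H(x, v')h.R[v'] & (\mathit{assign}) \\
R[H\,h.e] \;\to\; H\,h.R[e], \quad R \neq [\,] & (\mathit{lift}) \\
H\,h.H\,h'.e \;\to\; H\,h\,h'.e & (\mathit{merge})
\end{array}
$$

$$
\left(\begin{array}{ll}
\mathsf{mixin} & \\
\quad \mathsf{method}\ m_j = v_{m_j}; & (j \in \mathit{New}) \\
\quad \mathsf{redefine}\ m_k = v_{m_k}; & (k \in \mathit{Redef}) \\
\quad \mathsf{expect}\ m_i; & (i \in \mathit{Expect}) \\
\quad \mathsf{constructor}\ c; & \\
\mathsf{end} &
\end{array}\right)
\;\to\; \mathsf{mixinval}(\mathit{Gen}_m, \mathit{New}, \mathit{Redef}, \mathit{Expect})
\qquad (\mathit{mixval})
$$

$$
\begin{array}{l}
\mathit{Gen}_m = \lambda x.\ \mathsf{let}\ t = c(x)\ \mathsf{in} \\
\quad \{\ \mathit{gen} = \lambda \mathit{self}.\ \{\ m_j = \lambda y.\ v_{m_j}\ t.\mathit{fieldinit}\ \mathit{self}\ y \quad (j \in \mathit{New}), \\
\qquad\qquad\qquad\quad\ \ m_k = \lambda y.\ v_{m_k}\ t.\mathit{fieldinit}\ \mathit{self}\ y \quad (k \in \mathit{Redef})\ \}, \\
\quad\ \ \mathit{superinit} = t.\mathit{superinit}\ \}
\end{array}
$$

$$
\mathsf{mixinval}(\mathit{Gen}_m, \mathit{New}, \mathit{Redef}, \mathit{Expect}) \circ \mathsf{classval}(g, \mathcal{M})
\;\to\; \mathsf{classval}(\mathit{Gen}, \mathit{New} \cup \mathcal{M})
\qquad (\mathit{mixapp})
$$

$$
\begin{array}{l}
\mathit{Gen} = \lambda x.\ \lambda \mathit{self}. \\
\quad \mathsf{let}\ \mathit{mixinrec} = \mathit{Gen}_m(x)\ \mathsf{in} \\
\quad \mathsf{let}\ \mathit{mixingen} = \mathit{mixinrec}.\mathit{gen}\ \mathsf{in} \\
\quad \mathsf{let}\ \mathit{supergen} = g(\mathit{mixinrec}.\mathit{superinit})\ \mathsf{in} \\
\quad \{\ m_j = \lambda y.\ (\mathit{mixingen}\ \mathit{self}).m_j\ y \quad (j \in \mathit{New}), \\
\quad\ \ m_k = \lambda y.\ (\mathit{mixingen}\ \mathit{self}).m_k\ (\mathit{supergen}\ \mathit{self}).m_k\ y \quad (k \in \mathit{Redef}), \\
\quad\ \ m_i = \lambda y.\ (\mathit{supergen}\ \mathit{self}).m_i\ y \quad (i \in \mathcal{M} - \mathit{Redef})\ \}
\end{array}
$$

Fig. 2. Reduction rules

$$
\begin{array}{rl}
R ::= & [\,] \mid R\ e \mid v\ R \mid R.x \mid \mathsf{new}\ R \mid R \circ e \mid v \circ R \mid R \bullet e \mid v \bullet R \\
\mid & \{m_1 = v_1, \ldots, m_{i-1} = v_{i-1}, m_i = R, m_{i+1} = e_{i+1}, \ldots, m_n = e_n\} \quad (1 \le i \le n)
\end{array}
$$

Fig. 3. Reduction contexts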



The operational semantics extends that of the core calculus of classes and mixins
[10], and therefore exploits the treatment of side effects of Wright and Felleisen’s
Reference ML [25]. We give the reduction rules in Figures 2 and 4. To abstract from a precise set
of constants, we only assume the existence of a partial function
$\delta : \mathit{Const} \times \mathit{ClosedVal} \rightharpoonup \mathit{ClosedVal}$
that interprets the application of functional constants to closed values and
yields closed values. In Figure 2, $R$ are the reduction contexts [23,17,18]. Reduction
contexts are necessary to provide a minimal relative linear order among the creation,
dereferencing and updating of heap locations, since side effects need to be evaluated in
a deterministic order. Their definition can be found in Figure 3. We assume the reader
is familiar with the treatment of imperative side effects via reduction contexts and we
refer to [25,10] for a description of the related rules.

The (new) rule is responsible for instantiating new objects from class definitions. The
resulting function can be thought of as the composition of two functions: fix ∘ $g$. First,
the generator $g$ is applied to an argument $v$, thus creating a function from self to a record
of methods. Afterwards, the fixed-point operator fix is applied to bind self in method
bodies and create a recursive record (following [16]). The resulting record is a fully
formed object that could be returned to the user.
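The following Python sketch illustrates instantiation as fix ∘ $g$. It assumes, based only on the description above, that the (new) rule has the shape new classval($g$, $\mathcal{M}$) → $\lambda v.\,$fix$(g\ v)$. Records are modeled as dictionaries of functions, and self is passed as a thunk so that unfolding the fixed point is delayed, much as the $\lambda y$ wrappers delay it in the call-by-value calculus. The class `Point` and its generator are hypothetical.

```python
def fix(f):
    """Fixed point by unfolding: self is a thunk that rebuilds the record on demand."""
    return f(lambda: fix(f))

def new(classval):
    """Assumed shape of rule (new): new classval(g, M) -> λv. fix (g v)."""
    g, _method_names = classval
    return lambda v: fix(g(v))

def point_gen(x0):
    """Hypothetical generator: constructor argument -> (self -> record of methods)."""
    def from_self(self):
        return {
            "get": lambda _unit=None: x0,
            # 'twice' reaches 'get' through self, i.e. through the recursive record
            "twice": lambda _unit=None: 2 * self()["get"](),
        }
    return from_self

Point = (point_gen, {"get", "twice"})   # models classval(g, M)
p = new(Point)(21)
```

Here `p["twice"]()` goes through self to reach `get`, which is the binding that the fixed-point operator provides.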

Rule (mixval) turns a mixin expression into a mixin value. A mixin value consists
of a mixin generator $\mathit{Gen}_m$ and of the sets of mixin method names (new, redefined, and
expected; we recall that names are identified with their indices, as said in Section 2).
$\mathit{Gen}_m$ is a sort of compiled (equivalent) version of the mixin expression. Given the
parameter for the mixin constructor $c$, $\mathit{Gen}_m$ returns a record containing a (partial) object
generator gen, and the argument superinit for the (future) superclass constructor. We
recall that $c$ is a function of one argument which returns a record of two components: one
is the initialization expression for the private field (fieldinit), the other is the superclass
generator’s argument (superinit). The object generator gen binds the private field of
the methods defined (New) and redefined (Redef) by the mixin to fieldinit (recall that
method bodies take parameters for field, for self, and, if the method is a redefinition, also
for next, which will be bound to the corresponding superclass method). The returned
object generator is partial because it comes from a mixin, i.e., the expected methods
and the next for each redefined method will be provided by a superclass or by other
mixins (in fact, note that next is not yet bound in the bodies of the $m_k$). Notice that all the other
mixin operations, i.e., mixin application and mixin composition, are performed on mixin
values. In the original calculus of [10], mixin values are created and “blended” directly at
mixin-application time with a (super)class value to obtain a (sub)class value. Here mixin
values are made explicit to deal smoothly with mixin composition. For all the methods,
the method bodies are wrapped inside $\lambda y. \cdots y$ to delay evaluation in our call-by-value
calculus.
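As a rough illustration of rule (mixval), the Python sketch below builds a mixin generator from curried method bodies and a constructor. The function `mixval`, the dictionary-based records, and the example mixin (a new method `tag`, a redefined `write`, an expected `read`) are assumptions made for this sketch, not part of the calculus.

```python
def mixval(new, redef, expect, constructor):
    """Builds a tuple modeling mixinval(Gen_m, New, Redef, Expect).

    Method bodies are curried: body(field)(self)(y). For a redefinition,
    y is 'next' and the result is a function of the real argument.
    """
    def gen_m(x):
        t = constructor(x)                       # let t = c(x)
        def gen(self):
            def bind(body):
                # λy. v_m t.fieldinit self y
                return lambda y: body(t["fieldinit"])(self)(y)
            methods = {name: bind(body) for name, body in new.items()}
            methods.update({name: bind(body) for name, body in redef.items()})
            return methods
        return {"gen": gen, "superinit": t["superinit"]}
    return (gen_m, set(new), set(redef), set(expect))

# Hypothetical mixin: prefixes written data with its private field.
tagged = mixval(
    new={"tag": lambda field: lambda self: lambda _unit: field},
    redef={"write": lambda field: lambda self: lambda nxt: lambda data: nxt(field + data)},
    expect={"read"},
    constructor=lambda args: {"fieldinit": args[0], "superinit": args[1]},
)

gen_m, new_names, redef_names, expect_names = tagged
rec = gen_m((">> ", "file.txt"))
partial = rec["gen"](None)    # these bodies never use self, so None suffices here
```

The generated record is partial in the sense described above: `write` still waits for a `next` argument, and the expected `read` is absent.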

Rule (mixapp) evaluates the application of a mixin value to a class value, performing
mixin-based inheritance. A mixin value mixinval($\mathit{Gen}_m$, New, Redef, Expect) is applied
to a class value classval($g$, $\mathcal{M}$), which plays the role of the superclass, where $g$ is the
object generator of the superclass and $\mathcal{M}$ is the set of all method names defined in
the superclass. The resulting class value is classval($\mathit{Gen}$, $\mathit{New} \cup \mathcal{M}$), where $\mathit{Gen}$ is the
generator function for the subclass, and $\mathit{New} \cup \mathcal{M}$ lists all its method names. Using a
class generator delays full inheritance resolution until object instantiation time, when
self becomes available. The generator $\mathit{Gen}$ takes a single argument $x$, which is used
by the mixin generator, and returns a function from self to a record of methods. When
the fixed-point operator is applied to the function returned by the generator, it produces
a recursive record of methods representing a new object (see rule (new)). $\mathit{Gen}$ first
calls $\mathit{Gen}_m(x)$ to compute the mixin object generator mixingen, a function from self
to a record of mixin methods, and the parameter mixinrec.superinit to be passed to the
superclass generator $g$, which, in turn, returns a function supergen from self to a record of
superclass methods. $\mathit{Gen}$ is thus a function of self that returns a record containing
all the methods, from both the mixin and the superclass. All methods of the superclass
that are not redefined by the mixin, $m_i$ where $i \in \mathcal{M} - \mathit{Redef}$, are inherited by the
subclass: they are taken intact from the superclass’s “object” (supergen self). These
methods $m_i$ include all the methods that are expected by the mixin (this is ensured by
the type system, see Section 4). Methods $m_j$ defined by the mixin are taken intact from
the mixin’s “object” (mixingen self). As for redefined methods $m_k$, next is bound to
(supergen self).$m_k$ in $\mathit{Gen}$. Notice that, at this stage, all methods have already received
a binding for the private field. The variable self is passed all along in all method forms,
in such a way that the host object will be bound appropriately at object creation time.

$$
\begin{array}{l}
\mathsf{mixinval}(g_1, \mathit{New}_1, \mathit{Redef}_1, \mathit{Expect}_1) \bullet \mathsf{mixinval}(g_2, \mathit{New}_2, \mathit{Redef}_2, \mathit{Expect}_2) \;\to \\
\quad \mathsf{mixinval}(\mathit{Gen},\ \mathit{New}_1 \cup \mathit{New}_2,\ (\mathit{Redef}_1 \cup \mathit{Redef}_2) - \mathit{New}_2, \\
\qquad (\mathit{Expect}_1 - (\mathit{New}_2 \cup \mathit{Redef}_2)) \cup (\mathit{Expect}_2 - \mathit{Redef}_1))
\end{array}
$$

$$
\begin{array}{l}
\mathit{Gen} = \lambda x. \\
\quad \mathsf{let}\ \mathit{leftrec} = g_1(x)\ \mathsf{in} \\
\quad \mathsf{let}\ \mathit{rightrec} = g_2(\mathit{leftrec}.\mathit{superinit})\ \mathsf{in} \\
\quad \mathsf{let}\ \mathit{leftgen} = \mathit{leftrec}.\mathit{gen}\ \mathsf{in} \\
\quad \mathsf{let}\ \mathit{rightgen} = \mathit{rightrec}.\mathit{gen}\ \mathsf{in} \\
\quad \{\ \mathit{gen} = \lambda \mathit{self}.\ \{ \\
\qquad m_{j_1} = \lambda y.\ (\mathit{leftgen}\ \mathit{self}).m_{j_1}\ y \quad (j_1 \in \mathit{New}_1) \\
\qquad m_{j_2} = \lambda y.\ (\mathit{rightgen}\ \mathit{self}).m_{j_2}\ y \quad (j_2 \in \mathit{New}_2 - \mathit{Redef}_1) \\
\qquad m_{j_3} = \lambda y.\ (\mathit{leftgen}\ \mathit{self}).m_{j_3}\ (\mathit{rightgen}\ \mathit{self}).m_{j_3}\ y \quad (j_3 \in \mathit{Redef}_1 \cap \mathit{New}_2) \\
\qquad m_{k_1} = \lambda y.\ (\mathit{leftgen}\ \mathit{self}).m_{k_1}\ y \quad (k_1 \in \mathit{Redef}_1 - (\mathit{New}_2 \cup \mathit{Redef}_2)) \\
\qquad m_{k_2} = \lambda \mathit{next}.\ (\mathit{leftgen}\ \mathit{self}).m_{k_2}\ ((\mathit{rightgen}\ \mathit{self}).m_{k_2}\ \mathit{next}) \quad (k_2 \in \mathit{Redef}_1 \cap \mathit{Redef}_2) \\
\qquad m_{k_3} = \lambda y.\ (\mathit{rightgen}\ \mathit{self}).m_{k_3}\ y \quad (k_3 \in \mathit{Redef}_2 - \mathit{Redef}_1)\ \}, \\
\quad\ \ \mathit{superinit} = \mathit{rightrec}.\mathit{superinit}\ \}
\end{array}
$$

Fig. 4. Reduction rule (mixcomp) for mixin composition
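Rule (mixapp), described above, can be sketched in Python under a few stated assumptions: records are dictionaries, mixin and class values are plain tuples, self is passed as a thunk, and the (new) rule is taken to be $\lambda v.\,$fix$(g\ v)$. The two mixin values `GREETER` and `LOUD` are written by hand and are purely hypothetical.

```python
def fix(f):
    return f(lambda: fix(f))          # self is a thunk; unfolding is delayed

def new(classval):
    g, _names = classval
    return lambda v: fix(g(v))        # assumed shape of rule (new)

OBJECT = (lambda _x: lambda _self: {}, set())   # models classval(λ_.λ_.{}, [ ])

def mixapp(mixinval, classval):
    gen_m, new_names, redef_names, _expect = mixinval
    g, m_names = classval
    def gen(x):
        mixinrec = gen_m(x)
        mixingen = mixinrec["gen"]
        supergen = g(mixinrec["superinit"])
        def from_self(self):
            mix, sup = mixingen(self), supergen(self)
            record = {m: sup[m] for m in m_names - redef_names}   # inherited
            record.update({m: mix[m] for m in new_names})         # new
            # redefined: next is bound to the superclass method
            record.update({m: (lambda y, m=m: mix[m](sup[m])(y))
                           for m in redef_names})
            return record
        return from_self
    return (gen, new_names | m_names)

def greeter_gen(x):                   # x: the name, used as private field
    def gen(self):
        return {"name": lambda _: x,
                "greet": lambda _: "hello " + self()["name"](None)}
    return {"gen": gen, "superinit": None}

def loud_gen(x):                      # x: (suffix, argument for the superclass)
    suffix, arg = x
    def gen(self):
        return {"greet": lambda nxt: lambda y: nxt(y).upper() + suffix}
    return {"gen": gen, "superinit": arg}

GREETER = (greeter_gen, {"name", "greet"}, set(), set())
LOUD = (loud_gen, set(), {"greet"}, {"name"})

Greeter = mixapp(GREETER, OBJECT)
LoudGreeter = mixapp(LOUD, Greeter)
obj = new(LoudGreeter)(("!", "ada"))
```

The redefined `greet` calls the superclass version through next, while `name` is inherited unchanged from the superclass record.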

Rule (mixcomp) (Fig. 4) composes two mixins to produce a new mixin. The two
mixins may partially complete each other’s definitions, providing (some of) the missing
components. Let us denote the mixin composition by $e_1 \bullet e_2$ and the resulting
mixin by $e$. When composing two mixins, it is necessary to determine which sets of
new/redefined/expected methods the new mixin $e$ will have. Our design decision is as
follows: the mixin $e_2$ acts as a “superclass” for $e_1$ (mirroring mixin application order),
and, in particular, some of $e_1$’s methods may override some of $e_2$’s methods. Therefore, all
the new methods of the mixin $e_1$ ($\mathit{New}_1$) are inserted in the resulting mixin $e$, while only
the new methods of $e_2$ that are not redefined by $e_1$ ($j_2 \in \mathit{New}_2 - \mathit{Redef}_1$) become part of
the new mixin. Notice that the type rule for mixin composition (mixin comp) (Figure 6)
must check that no name clashes between new methods of $e_1$ and any method of $e_2$ take
place. This decision is in line with a good object-oriented design principle of not confusing
method redefinitions and name clashes. Therefore, an error is signaled at compile
time and not at runtime. As far as redefined methods are concerned, the situation is more
complex: the methods specified as redefining in $e_1$ can override some new methods of
$e_2$, some redefining methods of $e_2$, and (even if only virtually) some of the expected
methods of $e_2$.



- If a method $m_{j_3}$ in $e_1$ redefines a method defined in $e_2$ ($j_3 \in \mathit{Redef}_1 \cap \mathit{New}_2$), then
the overriding is completed and $m_{j_3}$ becomes a new method in the resulting mixin
$e$, after binding its next to $e_2$’s implementation of $m_{j_3}$;

- If $e_1$ redefines a method $m_{k_2}$ that, in turn, is redefined by $e_2$ ($k_2 \in \mathit{Redef}_1 \cap \mathit{Redef}_2$),
then this method is still a redefined method in $e$. Since $e_1$ “overrides” $e_2$, the
implementation of $m_{k_2}$ in $e_1$ redefines that of $e_2$: the next in the implementation of $e_1$
is bound to the implementation of $e_2$, and the next in the implementation of $e_2$ is not
bound, since it will be bound during future mixin composition or mixin application.
This means that the redefinition of a method $m_{k_2}$ by means of $e_2$ is delayed (while
$e_1$ has already performed its “internal” redefinition of $m_{k_2}$ over $e_2$);

- If $e_1$ redefines a method that is expected in $e_2$, then this method will become a
redefined method in $e$, so it will not appear among the expected methods of $e$, but it
will be a method that $e$ is willing to redefine.

Apart from the methods examined above, method redefinitions that are still present as
method redefinitions in the resulting mixin $e$ are: (i) the redefining ones from $e_2$ that are
not redefined by $e_1$ ($k_3 \in \mathit{Redef}_2 - \mathit{Redef}_1$); (ii) the ones from $e_1$ that are not defined
in $e_2$ and hence not “overriding” anything yet ($k_1 \in \mathit{Redef}_1 - (\mathit{New}_2 \cup \mathit{Redef}_2)$).

Finally, new and redefined methods from $e_2$ can provide some of the definitions that
the mixin $e_1$ expects; in that case, such methods expected by $e_1$ do not appear anymore
in the expected method set of $e$.
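The bookkeeping of method names described above amounts to a few set operations, shown here as a small Python sketch. The function name and the example method names are hypothetical; the name-clash check mirrors the condition that the text attributes to the type rule (mixin comp), and is included only for illustration.

```python
def compose_method_sets(new1, redef1, expect1, new2, redef2, expect2):
    """Method-name sets of e1 • e2, following the set expressions of rule (mixcomp)."""
    clash = new1 & (new2 | redef2 | expect2)
    if clash:
        # in the calculus this is rejected statically by the type system
        raise ValueError(f"name clash on new methods of e1: {sorted(clash)}")
    new = new1 | new2
    redef = (redef1 | redef2) - new2
    expect = (expect1 - (new2 | redef2)) | (expect2 - redef1)
    return new, redef, expect

# Hypothetical mixins: e1 redefines write and read and expects flush;
# e2 defines read and flush, redefines write, and expects open.
new, redef, expect = compose_method_sets(
    new1={"rekey"}, redef1={"write", "read"}, expect1={"flush"},
    new2={"read", "flush"}, redef2={"write"}, expect2={"open"},
)
```

In this example `read` ends up among the new methods (its overriding is completed inside the composite), `write` remains a redefinition, and only `open` is still expected.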

The generator of the new mixin is a combination of the generators of $e_1$ and $e_2$.
Since $e_1$ is considered to be the “subclass”, the parameter $x$ is passed to $g_1$, and $g_2$
receives as a parameter the superinit returned by $g_1(x)$; the superinit field of the record
returned by the generator of the new mixin is set to $g_2(g_1(x).\mathit{superinit}).\mathit{superinit}$. This
strategy for building the new mixin generator corresponds to serializing the call of the
two constructors, similarly to what happens in standard object-oriented languages. Notice
that this is consistent with the type mixin($\gamma_{b_2}$, $\gamma_{d_1}$, $\Sigma_{new}$, $\Sigma_{red}$, $\Sigma_{exp}$, $\Sigma_{old}$) assigned to
the new mixin by the type rule (mixin comp) (Figure 6).
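The composite generator can be prototyped in Python to see that, at least on a small example, ($e_1 \bullet e_2$) ∘ C behaves like $e_1$ ∘ ($e_2$ ∘ C). This is a sketch under assumptions: records are dictionaries, values are tuples, self is a thunk, (new) is taken to be $\lambda v.\,$fix$(g\ v)$, and the mixins `APPEND`, `REPEAT`, and `BASE` are hypothetical.

```python
def fix(f):
    return f(lambda: fix(f))

def new(classval):
    return lambda v: fix(classval[0](v))      # assumed shape of rule (new)

def mixapp(mv, cv):
    gen_m, new_names, redef_names, _ = mv
    g, m_names = cv
    def gen(x):
        rec = gen_m(x)
        supergen = g(rec["superinit"])
        def from_self(self):
            mix, sup = rec["gen"](self), supergen(self)
            out = {m: sup[m] for m in m_names - redef_names}
            out.update({m: mix[m] for m in new_names})
            out.update({m: (lambda y, m=m: mix[m](sup[m])(y)) for m in redef_names})
            return out
        return from_self
    return (gen, new_names | m_names)

def mixcomp(mv1, mv2):
    g1, new1, redef1, expect1 = mv1
    g2, new2, redef2, expect2 = mv2
    def gen_x(x):
        leftrec = g1(x)
        rightrec = g2(leftrec["superinit"])   # constructors are serialized
        def gen(self):
            left, right = leftrec["gen"](self), rightrec["gen"](self)
            rec = {m: left[m] for m in new1 | (redef1 - (new2 | redef2))}
            rec.update({m: right[m] for m in (new2 | redef2) - redef1})
            # overriding completed: next is e2's implementation
            rec.update({m: left[m](right[m]) for m in redef1 & new2})
            # still a redefinition: e2's next stays unbound
            rec.update({m: (lambda nxt, m=m: left[m](right[m](nxt)))
                        for m in redef1 & redef2})
            return rec
        return {"gen": gen, "superinit": rightrec["superinit"]}
    return (gen_x, new1 | new2, (redef1 | redef2) - new2,
            (expect1 - (new2 | redef2)) | (expect2 - redef1))

def wrapper(transform):
    """Hypothetical mixin value redefining write; constructor argument is (field, superinit)."""
    def g(x):
        field, superinit = x
        gen = lambda self: {"write": lambda nxt: lambda data: nxt(transform(field, data))}
        return {"gen": gen, "superinit": superinit}
    return (g, set(), {"write"}, set())

APPEND = wrapper(lambda field, data: data + field)
REPEAT = wrapper(lambda field, data: data * field)
BASE = (lambda x: {"gen": lambda self: {"write": lambda data: x + data},
                   "superinit": None}, {"write"}, set(), set())
OBJECT = (lambda _x: lambda _self: {}, set())

Base = mixapp(BASE, OBJECT)
arg = ("!", (2, "log: "))
composed = new(mixapp(mixcomp(APPEND, REPEAT), Base))(arg)
nested = new(mixapp(APPEND, mixapp(REPEAT, Base)))(arg)
```

Both objects thread the constructor argument the same way: `"!"` goes to the left mixin, `2` to the right one, and `"log: "` to the base class.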

4 Type System 

In addition to the functional, record, and reference types of the Reference ML type system, our
type system has class types and mixin types.

The types in our system are the following:

$$
\tau ::= \iota \mid \tau_1 \to \tau_2 \mid \tau\ \mathsf{ref} \mid \{m_i : \tau_{m_i}\}^{i \in I} \mid \mathsf{class}(\tau, \Sigma_b) \mid \mathsf{mixin}(\tau_1, \tau_2, \Sigma_{new}, \Sigma_{red}, \Sigma_{exp}, \Sigma_{old})
$$

where $\iota$ is a constant type, $\to$ is the functional type operator, and $\tau\ \mathsf{ref}$ is the type of locations
containing a value of type $\tau$. The other type forms are described below.

$\Sigma$ (possibly with a subscript) denotes a record type of the form $\{m_i : \tau_{m_i}\}^{i \in I}$. The
set of indexes $I$ (where $I \subseteq \mathbb{N}$) is often omitted when it is not relevant. A record type
can be viewed as a set of pairs label:type where labels are pairwise distinct ($\Sigma_1$ and
$\Sigma_2$ are considered equal, denoted by $\Sigma_1 = \Sigma_2$, if they differ only in the order of their
elements). Notations and operations on sets are easily extended to record types as in the
following definitions:

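- if $m_i : \tau_{m_i} \in \Sigma$ we say that the subject $m_i$ occurs in $\Sigma$ (with type $\tau_{m_i}$). $\mathit{Subj}(\Sigma)$
denotes the set of all subjects occurring in $\Sigma$;

- $\Sigma_1 \cup \Sigma_2$ is the standard set union (used only on $\Sigma_1$ and $\Sigma_2$ such that
$\mathit{Subj}(\Sigma_1) \cap \mathit{Subj}(\Sigma_2) = \emptyset$, in order to guarantee that $\Sigma_1 \cup \Sigma_2$ is a record type);

- $\Sigma_1 - \Sigma_2$ is the standard set difference;

- $\Sigma_1 / \Sigma_2 = \{m_i : \tau_{m_i} \mid m_i : \tau_{m_i} \in \Sigma_1 \wedge m_i \text{ occurs in } \Sigma_2\}$.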
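The four operations on record types translate directly into dictionary manipulations. The Python sketch below assumes record types are modeled as dictionaries from method names to type descriptions (plain strings here); the function names and the sample signatures are hypothetical.

```python
def subj(sigma):
    """Subj(Σ): the set of subjects (method names) occurring in Σ."""
    return set(sigma)

def union(s1, s2):
    """Σ1 ∪ Σ2, defined only when the subjects are disjoint."""
    if subj(s1) & subj(s2):
        raise ValueError("union of record types with overlapping subjects")
    return {**s1, **s2}

def difference(s1, s2):
    """Σ1 − Σ2: set difference on label:type pairs."""
    return {m: t for m, t in s1.items() if (m, t) not in s2.items()}

def restrict(s1, s2):
    """Σ1/Σ2: the pairs of Σ1 whose subject occurs in Σ2."""
    return {m: t for m, t in s1.items() if m in s2}

# Hypothetical signatures for a socket class and a mixin's redefined methods
sigma_b = {"write": "bytes -> unit", "read": "unit -> bytes",
           "IPaddress": "unit -> string"}
sigma_red = {"write": "bytes -> unit", "read": "unit -> bytes"}

redefined_in_super = restrict(sigma_b, sigma_red)          # Σb/Σred
inherited = difference(sigma_b, redefined_in_super)        # Σb − (Σb/Σred)
```

With these sample signatures, `inherited` contains only `IPaddress`, the one superclass method the mixin does not redefine.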

The definitions of typing environments $\Gamma$ and of typing judgments are standard. Our type
system supports structural subtyping ($<:$ relation) along with a subsumption rule (sub).
The subtyping rules are shown in Appendix A. Since subtyping on references is unsound
and we wish to keep subtyping and inheritance completely separate, we have only the
basic subtyping rules for function and record types. Subtyping only exists at the object
level, and is not supported for class or mixin types (as explained in the introduction).

In the class type class($\gamma$, $\Sigma_b$), $\gamma$ is the type of the generator’s argument and
$\Sigma_b = \{m_i : \tau_{m_i}\}$ is a record type representing self.

In the mixin type mixin($\gamma_b$, $\gamma_d$, $\Sigma_{new}$, $\Sigma_{red}$, $\Sigma_{exp}$, $\Sigma_{old}$)

- $\gamma_b$ is the expected argument type of the superclass generator,

- $\gamma_d$ is the exact argument type of the mixin generator,

- $\Sigma_{new} = \{m_j : \tau_{m_j}\}$ are the exact types of the new methods introduced by the mixin,

- $\Sigma_{red} = \{m_k : \tau_{m_k}\}$ are the exact types of the methods redefined by the mixin,

- $\Sigma_{exp} = \{m_i : \tau_{m_i}\}$ are the types of the methods that are neither defined nor redefined
by the mixin, but expected to be supported by a superclass which the mixin will be
applied to, or by another mixin which the mixin will be composed with,

- $\Sigma_{old} = \{m_k : \tau^{\uparrow}_{m_k}\}$ are the types assumed for the old bodies of the methods redefined
by the mixin.

We report in Figure 5 the typing rules regarding classes and mixins (the rest of the
typing rules are given in Appendix A). Some of them are syntactic variations of those
presented in [10] and we refer the reader to that paper for comments about such rules. We
only comment upon the rules related to mixin forms. The rules (mixin) and (mixin val)
assign the same type to their respective expressions, although deduced in a different way.
In the rule (mixin) the side condition $\mathit{Subj}(\Sigma_{new}) \cap \mathit{Subj}(\Sigma_{red}) \cap \mathit{Subj}(\Sigma_{exp}) = \emptyset$ ensures
that the names of new, redefined, and expected methods are all distinct. In the rule (mixin
app), $\Sigma_b$ contains the type signatures of all methods supported by the superclass to which
the mixin is applied, and $\Sigma_b / \Sigma_{red}$ are the superclass methods redefined by the mixin
(the superclass may have more methods than those required by the mixin constraints).
The premises of the rule (mixin app) are the following:

i) $\Sigma_b <: (\Sigma_{exp} \cup \Sigma_{old})$ requires the actual types of the superclass methods to be subtypes
of those expected by the mixin.

ii) $\Sigma_{red} <: \Sigma_b / \Sigma_{red}$ requires that the types of the actual implementations of methods in
the superclass (which may belong to a subtype of $\Sigma_{old}$, from the above constraint)
are supertypes of the ones redefined in the mixin. Thus, the types of the methods
redefined by the mixin ($\Sigma_{red}$) will be subtypes of the superclass methods with the
same name.

iii) $\mathit{Subj}(\Sigma_b) \cap \mathit{Subj}(\Sigma_{new}) = \emptyset$ guarantees that no name clash will take place during
the mixin application.

Intuitively, the above constraints ensure that all the actual method bodies of the newly created
subclass are at least as “good” as expected. The resulting class, of type class($\gamma_d$, $\Sigma_d$),
contains the signatures of all the methods forming the new class, created as the result
of mixin application. $\Sigma_{red}$ and $\Sigma_{new}$ are methods defined by the mixin, whereas
$\Sigma_b - (\Sigma_b / \Sigma_{red})$ are the methods inherited directly from the superclass. Let us observe
that, for any well-typed mixin, $\mathit{Subj}(\Sigma_{red}) = \mathit{Subj}(\Sigma_{old})$; therefore for any record type
$\Sigma$, $\Sigma / \Sigma_{red} = \Sigma / \Sigma_{old}$.
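The three premises of (mixin app) and the resulting method signature can be checked mechanically. The Python sketch below is an illustration under explicit assumptions: the subtyping rules of Appendix A are not reproduced in this section, so `subtype` implements the usual structural rules (reflexive base types, contravariant arguments and covariant results for functions, width and depth subtyping for records); base types are strings, and all names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fun:
    """Function type arg -> res; base types are strings, record types are dicts."""
    arg: object
    res: object

def subtype(s, t):
    """Assumed standard structural subtyping (not taken from Appendix A)."""
    if isinstance(s, dict) and isinstance(t, dict):
        return all(m in s and subtype(s[m], t[m]) for m in t)
    if isinstance(s, Fun) and isinstance(t, Fun):
        return subtype(t.arg, s.arg) and subtype(s.res, t.res)
    return s == t

def restrict(s1, s2):
    """Σ1/Σ2"""
    return {m: ty for m, ty in s1.items() if m in s2}

def mixin_app(sig_b, sig_new, sig_red, sig_exp, sig_old):
    """Checks premises (i)-(iii) and returns the method signature of the subclass."""
    if not subtype(sig_b, {**sig_exp, **sig_old}):
        raise TypeError("premise (i) fails")
    if not subtype(sig_red, restrict(sig_b, sig_red)):
        raise TypeError("premise (ii) fails")
    if sig_b.keys() & sig_new.keys():
        raise TypeError("premise (iii) fails: name clash")
    inherited = {m: ty for m, ty in sig_b.items() if m not in sig_red}
    return {**sig_red, **sig_new, **inherited}

write_t, read_t = Fun("bytes", "unit"), Fun("unit", "bytes")
socket = {"write": write_t, "read": read_t, "IPaddress": Fun("unit", "string")}
encrypted = dict(sig_new={}, sig_red={"write": write_t, "read": read_t},
                 sig_exp={}, sig_old={"write": write_t, "read": read_t})
sig_d = mixin_app(socket, **encrypted)
```

Applying the Encrypted-like signature to the socket signature succeeds and keeps `IPaddress` as an inherited method, matching the remark that the superclass may have more methods than the mixin requires.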

Now we concentrate on the main topic of the paper, the rule for mixin composition
(mixin comp) given in Figure 6. Since $e_2$ acts as the “superclass” of $e_1$, $e_1$ will pass the
argument of type $\gamma_{b_1}$ to the constructor of the superclass $e_2$, which expects an argument of
type $\gamma_{d_2}$ for its constructor. Therefore, we require that $\gamma_{b_1} <: \gamma_{d_2}$ (condition $(c_1)$). The
mixin $e_1$ is allowed to redefine methods that are defined by $e_2$, expected by $e_2$, or redefined by
$e_2$. In all cases we must check that the redefinition (and the expectation about the old
method in the superclass) is type safe (conditions $(c_2)$, $(c_3)$ and $(c_4)$). If $e_1$ redefines
a method $m_p$ that is in turn redefined by $e_2$, then we will put the redefined type of
$m_p$ from $e_2$ in $\Sigma_{red}$ and the old one from $e_2$ in $\Sigma_{old}$. This is consistent with the view
that the new mixin will contain $m_p$ with the body from $e_1$ (with its next bound to $e_2$’s
implementation, while in the body of $m_p$ from $e_2$ next still remains unbound, as the method
$m_p$ can be further redefined, see Section 3). If $e_1$ redefines, instead, an expected method
of $e_2$, that method will not appear in $\Sigma_{exp}$, but the redefined type and the old type, as
inferred from $e_2$, will appear in $\Sigma_{red}$ and $\Sigma_{old}$, respectively. Conditions $(c_5)$ and $(c_6)$
check whether $e_2$ can provide methods (either defined or redefined) that are expected
by $e_1$. If such a method is provided, then it will not appear in $\Sigma_{exp}$. In case both $e_1$
and $e_2$ expect the same method, the types with which such a method is expected must
be comparable (condition $(c_7)$); the method will then appear in $\Sigma_{exp}$ with the smaller
type. Finally, condition $(c_8)$ checks that no name clash occurs among methods defined
by $e_1$ and those defined/redefined/expected by $e_2$. This decision is in line with a good
object-oriented design principle of not confusing method redefinitions and name clashes.

$$
\frac{\Gamma \vdash g : \gamma \to \{m_i : \tau_{m_i}\}^{i \in \mathcal{M}} \to \{m_i : \tau_{m_i}\}^{i \in \mathcal{M}}}
     {\Gamma \vdash \mathsf{classval}(g, \mathcal{M}) : \mathsf{class}(\gamma, \{m_i : \tau_{m_i}\}^{i \in \mathcal{M}})}
\quad (\mathit{class\ val})
$$

$$
\frac{\Gamma \vdash e : \mathsf{class}(\gamma, \{m_i : \tau_{m_i}\})}
     {\Gamma \vdash \mathsf{new}\ e : \gamma \to \{m_i : \tau_{m_i}\}}
\quad (\mathit{instantiate})
$$

$$
\frac{\begin{array}{ll}
(\mathit{New}) & \text{For } j \in \mathit{New}:\ \Gamma \vdash v_{m_j} : \eta \to \Sigma \to \tau_{m_j} \\
(\mathit{Redef}) & \text{For } k \in \mathit{Redef}:\ \Gamma \vdash v_{m_k} : \eta \to \Sigma \to \tau^{\uparrow}_{m_k} \to \tau_{m_k} \\
(\mathit{Constr}) & \Gamma \vdash c : \gamma_d \to \{\mathit{fieldinit} : \eta,\ \mathit{superinit} : \gamma_b\} \\
 & \mathit{Subj}(\Sigma_{new}) \cap \mathit{Subj}(\Sigma_{red}) \cap \mathit{Subj}(\Sigma_{exp}) = \emptyset
\end{array}}
{\Gamma \vdash
\left(\begin{array}{ll}
\mathsf{mixin} & \\
\quad \mathsf{method}\ m_j = v_{m_j}; & (j \in \mathit{New}) \\
\quad \mathsf{redefine}\ m_k = v_{m_k}; & (k \in \mathit{Redef}) \\
\quad \mathsf{expect}\ m_i; & (i \in \mathit{Expect}) \\
\quad \mathsf{constructor}\ c; & \\
\mathsf{end} &
\end{array}\right)
: \mathsf{mixin}(\gamma_b, \gamma_d, \Sigma_{new}, \Sigma_{red}, \Sigma_{exp}, \Sigma_{old})}
\quad (\mathit{mixin})
$$

$$
\frac{\Gamma \vdash g : \gamma_d \to \{\mathit{gen} : \Sigma \to \{m_j : \tau_{m_j},\ m_k : \tau^{\uparrow}_{m_k} \to \tau_{m_k}\},\ \mathit{superinit} : \gamma_b\}}
     {\Gamma \vdash \mathsf{mixinval}(g, \mathit{New}, \mathit{Redef}, \mathit{Expect}) : \mathsf{mixin}(\gamma_b, \gamma_d, \Sigma_{new}, \Sigma_{red}, \Sigma_{exp}, \Sigma_{old})}
\quad (\mathit{mixin\ val})
$$

where $\Sigma = \Sigma_{new} \cup \Sigma_{red} \cup \Sigma_{exp}$.

Fig. 5. Typing rules for class and mixin-related forms

Fig. 6. Typing rule for mixin composition

Our system is proved sound, in the sense that “every well-typed program cannot
go wrong”, which implies the absence of message-not-understood runtime errors. We
consider programs, which are closed terms, and we introduce faulty programs, which are
a way to approximate the concept of reaching a “stuck state” during the evaluation; for
example, a program “reaches a stuck state” if a method call is attempted on an expression
that does not evaluate to an object. We prove that if the evaluation for a program $p$ does
not diverge, then either $p$ returns a value, or $p$ reduces to a faulty program. We then show
that faulty programs are not typable, and, via a subject reduction property, we establish
that if a program is typable, then it evaluates to a value, under the condition that the
program does not diverge.

Lemma 1 (Subject Reduction). If $\Gamma \vdash e : \tau$ and $e$ evaluates to $e'$, then $\Gamma \vdash e' : \tau$.

Theorem 1 (Soundness). Let $p$ be a program: if $\varepsilon \vdash p : \tau$ then either the evaluation
for $p$ diverges, or $p$ evaluates to a value $v$ and $\varepsilon \vdash v : \tau$ ($\varepsilon$ stands for the empty typing
environment).

The metatheory for the present system, and in particular the subject reduction property,
are extensions of the ones in [9] (Chapter 9). The formal definitions and properties
were analyzed in detail, and can be found at:

http://www.dsi.unifi.it/~bettini/high-proofs.pdf.

5 Conclusions 

This paper presents a calculus supporting the creation of class hierarchies via mixin application
(already present in [10]) and mixin composition. Our goal was to design a clean and
general form of mixin composition without committing ourselves to an already existing
language. We chose to extend the calculus of [10] because: (i) it is an easy-to-extend
framework; (ii) its operational semantics is close both to an implementation and to a
denotational model, so that a step towards a denotational model for mixins is, in our
opinion, a good by-product; (iii) it allowed us to choose
structural subtyping (as opposed to the nominal subtyping of C++ and Java), since, according
to Bracha et al. [5], “When subtyping is structural, mixins do not introduce any new
issues with respect to subtyping.” Moreover, structural subtyping has the advantage of
being independent from the class hierarchy.

In the literature, there are many proposals that deal with mixins. We mention here
those most relevant to our calculus. Bracha and Cook extend
Modula-3 with mixins in [14] (this is one of the seminal papers on mixins). The novelty
is in seeing object types as mixins, which either explicitly state the modifications to the
superclass, or are obtained as a result of mixin composition. The left-hand mixin has a
“priority” and the composition is not explicitly written in order to ensure upward
compatibility with the existing language. Instead, we think that making the composition explicit
(as it is in our calculus) makes the programmer aware of how software components are
composed, thus providing more control over the behavior of the program.

Flatt et al. [20] extend a subset of sequential Java called ClassicJava with mixins
and call it MixedJava. Mixins use their inheritance interface to specify how the inherited
methods are extended and/or overridden. Existing mixins can be combined in order
to produce new composite mixins. As in our calculus, the left-hand mixin has
“precedence” over the right-hand mixin. Composition is well-defined only if the right-hand
mixin implements the left-hand mixin inheritance interface (i.e., the right-hand mixin is
required to provide all the methods expected by the left-hand one). In this respect, our
approach is more oriented to code composition, in that the new composite mixin is still
allowed to have expected methods not yet resolved. The duplication of method names
in MixedJava is resolved at run-time with the run-time context information provided
by the current view of the object (represented as a chain of mixins).

Ancona and Zucca [2,3,4] give a formal model for mixin modules. A mixin is a function
from input to output components, and they characterize axiomatically the operators
for composing mixins in order to obtain higher-order mixins. They also present a variety
of method renaming forms, to deal with different kinds of name collisions. In [1]
they present Jam, an extension of Java supporting mixins, but not mixin composition,
where name collision is treated essentially as “accidental override”.

Our approach is different from the ones of MixedJava and Jam in some respects. 
Besides not being a Java-like calculus, which allows us to use structural subtyping, our 
calculus has a more modular class constructor. Moreover, method name collisions are
resolved statically by the type system. Although this approach may look more restrictive than
the ones of MixedJava and Jam, we preferred it because it forces the programmer to be
aware of collisions and to resolve them, while automatic handling of such ambiguities 
may lead to unexpected behavior at run-time. 

Boudol [12] extends Reference ML [25] with records and a let rec operator. This
enriched ML leads to a theoretically solid treatment of mixins, which are seen as class 
transformers. The principal difference between the two calculi seems to be in the way 
references to fields are created. In our calculus these are created at class creation time, 
when mixin application is evaluated, whereas in the calculus of Boudol they are created 
at class instantiation time, i.e., when an object is created. 

A future research direction is an extension of this calculus where not only classes 
can be instantiated but also mixins, obtaining a form of incomplete objects, to be 
completed in an object-based fashion. A first version of the incomplete objects is given 
in [6]. Moreover, higher-order mixins seem to be a natural feature to be added to MoMi 
[7], a coordination language where object-oriented mobile code is exchanged among 
the nodes of a network. 

Acknowledgment. We would like to thank the anonymous referees for their comments 
and suggestions, which helped us in giving a better focus to the paper and in improving 
the overall presentation.

Abstract. We address the issue of the decidability of the type inference problem
for a type system of an object-oriented calculus with general selftypes. The frag- 
ment considered in the present paper is obtained by restricting the set of operators 
to method invocation only. The resulting system, despite its syntactical simplicity, 
is sufficiently complicated to merit the study of the intricate constraints emerging 
in the process of type reconstruction, and it can be considered as the core system 
with respect to typability for extensions with other operators. The main result of 
the paper is the decidability of type reconstruction, together with a certain form 
of a principal type property. 



1 Introduction 

Object-oriented programming languages enjoy an ever growing popularity, as they are a 
tool for designing maintainable and expandable code, and are also suited for developing 
mobile code and web applications. Imposing a type discipline on programs ensures safety 
(i.e., the absence of message-not-understood run-time errors), yet this type discipline 
must be flexible enough in order not to restrain reusability. Polymorphic type systems 
are one answer to this double requirement; see for example [6,13]. Among the many 
features that can be included in such systems, there is the use of selftype. 

The concept of self (sometimes called this) is of paramount importance in object-
oriented languages. Self is a special variable that allows reference to the object executing
the current method, and hence access to its fields and invocation of the sibling meth- 
ods. This concept, while being a very convenient feature, influences substantially the 
problem of static typing for object-oriented languages. Self types have been a subject 
of foundational studies, both in the object-based and in the class-based setting (see, for
example, [1,8,11]). The work done in the past has highlighted the importance of typing
self in a careful way. 

The gain of introducing an appropriate type for self is evident when a form of 
inheritance is present, whether a class-based one (via class hierarchies), or an object-
based one (via method addition/override). In fact, an appropriate type for self would allow
the automatic specialization of those inherited/overridden methods that either return the
host object, and/or have some parameters of the same type as the host object (binary
methods). This specialization (also known as MyType specialization) can be seen as
an alternative to typecasts, which are explicit declarations made by the programmer 
on the expected actual type of the method result according to the type of the object the 
method is invoked upon. Typecasts are unsafe — the programmer must be “sure” about the 
actual type of the returned object, since little or even no static checking is performed on 
typecasts, as we observe, for example, in Java or in C++. Moreover typecasts certainly do 
not improve readability of code, with a negative impact on the debugging phase. As hinted 
above, an alternative to typecasts is the introduction of selftype, with the meaning “the 
type of the current object”, i.e., “the type of self”, to annotate appropriately binary 
methods and methods that return the host object. Some type systems including selftype 
are presented in [1,8,11].

In this paper we consider the problem of type inference (also called typability) in 
presence of general (arbitrarily nested) selftypes in an object-based setting. Very little is 
known about type inference with selftypes. To the best of our knowledge, only Palsberg 
and Jim addressed this subject. In [16], they study the type inference problem for one 
of the Abadi-Cardelli systems [1] extended with the notion of selftype. In [15], Palsberg 
presents an algorithm for Abadi-Cardelli’s four first-order systems (without any form of 
selftype), proving that the type inference problem for all four systems is P-complete¹.
The work [16] can be seen as an application of the techniques developed in [15], and 
it contains a proof that the type-inference problem for an Abadi-Cardelli system with 
recursive types and width subtyping extended with a simple form of selftype is NP- 
complete. 

The Palsberg-Jim “tiny drop of selftype”, as the authors themselves point out, consists 
of: (i) the use of the keyword selftype, instead of a bound variable, to stand for the 
selftype in object types; (ii) the restriction that each occurrence of selftype “comes with
its context”, i.e., in their type system selftype can appear as a component of an object 
type only, never in isolation. These two choices imply the following consequences: (i) it 
is not possible to refer to the selftype of enclosing outer objects (i.e., there are no nested
selftypes); (ii) it is not possible to override those methods that return the object itself 
(i.e., of type selftype). 

Of the above two restrictions, the first one seems to be essential. It implies that the 
access to two or more different selftypes, i.e., two or more different environments, is
impossible. Indeed, the “tiny drop of selftype” of Palsberg and Jim can be encoded in a 
system without selftype (see [9]), meaning that it is a rather weak form of selftype. 

We plan, therefore, to analyze the decidability of type inference for a type system 
that relaxes the above limitations. The system under study is based on the calculus 
presented in [3] (we will call it C from now on). The calculus C is an untyped version 
of the calculus introduced in [5], in order to analyze through a functional encoding
the type system of [4] (hereafter BB), which is, in turn, a simplification of the Lambda 
Calculus of Objects (hereafter LCO) of Fisher, Honsell, Mitchell [11]. The calculus 
LCO is a functional object-based calculus enriched with object primitives. Operations 
allowed on objects are method addition, method override, and method invocation (also
called send). Method bodies are functions; in particular, self is modelled using a lambda-
abstracted variable. In LCO: (i) it is possible to refer to the selves of the enclosing objects;
(ii) override is as general as possible. Selftype is rendered by using the row-variables
of [14] to characterize types of methods as type-schemes (i.e., types polymorphic in these
variables), and to enforce correct instantiation of the schemes as methods are inherited.
The type system of C we base our paper on inherits some of the fundamental ideas from
the original system in the modelling of selftype. The main difference is that it avoids the
use of row-variables to model selftype, exploiting instead Bruce’s matching [7] and
implicit match-bounded quantification over type variables, as studied in the BB calculus
of [4]. The designers of LCO conjectured that type inference for LCO was undecidable,
but nobody has proven that yet. We focus on the same type inference problem for C,
which has a simpler (yet equally expressive) type system than LCO.

1 In [12] this result is improved.

Due to the generality of system C, we decided to tackle the related type inference 
problem in steps. First of all, we discarded the method addition operation. Method 
addition does not add much to the inherent difficulty of the problem of type inference, 
because it is performed on objects but it is forbidden on selves². Second, we also discard
method override, because method invocation turns out to create a surprisingly non-trivial 
problem by itself. The possibility of referring to nested selves of enclosing outer objects 
creates “reference loops” which are difficult to untangle. 

It might seem natural to identify objects with recursive records, and hence to identify 
their types with recursive types, but this is misleading. In fact: (i) such a choice is not
adequate already in our setting with method invocation only, because of the generality 
of our selves (see the examples in Section 2); (ii) generally, this solution does not work 
in an enlarged setting with method override and/or addition because the meaning of self 
changes as operations on the host object are performed (see [1], Section 6.7.2). 

We make a number of simplifications in our syntax. For instance we consider objects 
with exactly two methods, and put a “constant” □ as a place-holder wherever we mean 
“an irrelevant subexpression”. These simplifications do not influence the essence of the 
problem, and make it easier to isolate the basic issue: type assignment in presence of 
multiple selves. The place-holder is just there to hide whatever does not influence typing 
itself. The presence of two fields only rules out the so-called message-not-understood 
run-time errors, as sends are limited to those two components. Even though catching 
statically such errors is a primary task of type systems for object-oriented languages, 
the task of testing a sort of “well-formedness” of objects is essential as well, and this is 
what our typability algorithm does. 

The paper is organized as follows. Section 2 introduces the most basic syntactic 
categories, terms and types, and explains the motivation of our type assignment. In 
Section 3, we elaborate the syntactic notions used in the paper. Section 4 presents a 
type assignment system. In Section 5, we introduce the main tool in establishing a sort 
of “principal type scheme” property and the confluence property of a certain system 
of reductions. The principal technical part of the paper is Section 6. It is devoted to an 
algorithm which transforms a given term into a “type scheme”, or reports a failure. Two 
classes of redexes are introduced: reducible and cyclic. Among the reducible redexes
there are redexes which we call inconsistent. They are the only source of a possible
untypability of a term in our setting. Section 6.4 contains the main technical properties
of the reduction system: confluence and termination (Theorem 1), and recovery of typings
(Theorem 2). Section 7 contains the main result of the paper, Theorem 3. It states that there
is a type scheme assigned to every (and only) typable term such that all the instantiations
of this scheme give correct typings of the term. This principal type scheme can be
effectively obtained if and only if the term is typable. Therefore the typability problem
is decidable. Section 8 gives an informal account of how to deal with method addition
and message-not-understood run-time errors.

2 There are calculi that deal with self-inflicted method addition such as [10], but they go beyond
the goal of this paper, which is about classical object-based calculi.

Due to space limitations, we have omitted from this presentation all proofs and many 
auxiliary definitions which are not essential for understanding the main results of the 
paper. The details can be found in the full version of the paper in: 
http://www.di.unito.it/~bono/Manuscripts.



2 Terms and Types 

Assume an infinite set of variables (selves), with the notation s, t, ... A term is either
a variable, or a place-holder indicated with □, or:

- an object, i.e., an expression of the form pro s( M_1, M_2 ), where M_1 and M_2 are
terms, or:

- a send, i.e., an expression of the form M <= i, where M is a term and i ∈ {1, 2}.

The operator pro s binds the self s. Alpha conversion is assumed. The notation FV(M)
and M[N/s] is used accordingly (with □[N/s] = □).

The intended meaning of “pro s( M_1, M_2 )” is an object with two methods M_1 and
M_2, which may refer to the whole object via the self variable s. In the notation of [1]
this would be written as ( ςs.M_1, ςs.M_2 ). The meaning of “M <= i” is to extract the
i-th method from the object M, and the operational semantics is given by the following
reduction rule: pro s( M_1, M_2 ) <= i → M_i[pro s( M_1, M_2 )/s].

The place-holder □ may be seen as a representation of a piece of code (i.e., a subex- 
pression) which is irrelevant with respect to typing. Our objective is to study the structure 
of self-references occurring in object expressions, by means of a mathematical abstrac- 
tion. For the purpose of the analysis of our abstract model, anything that does not contain 
self-references is considered irrelevant and thus can be represented by a □. Clearly, a 
message sent to an irrelevant target is irrelevant too, so we do not find anything wrong
in postulating the reduction □ <= i → □, which expresses exactly the idea of ignoring
the “contents” of □. On the other hand, the expression s <= i has some meaning, but it
cannot be evaluated until we substitute an actual object for s.
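
The two reduction rules can be tried out on a small model. The following is a minimal sketch, assuming a hypothetical encoding of terms as nested Python tuples; the names BOX, subst and step are illustrative and do not come from the paper. Substitution relies on the alpha-conversion convention, so it does not rename bound selves.

```python
# Hypothetical encoding of terms as nested tuples (not taken from the paper):
#   BOX                   the place-holder
#   ("var", s)            a self s
#   ("pro", s, M1, M2)    the object pro s( M1, M2 )
#   ("send", M, i)        the send M <= i, with i in {1, 2}
BOX = "□"


def subst(M, N, s):
    """Return M[N/s], replacing the free occurrences of the self s in M by N.

    Assumption: bound selves of M differ from the free selves of N, so no
    capture can occur (the text simply assumes alpha conversion).
    """
    if M == BOX:
        return BOX  # □[N/s] = □
    if M[0] == "var":
        return N if M[1] == s else M
    if M[0] == "send":
        return ("send", subst(M[1], N, s), M[2])
    _, t, M1, M2 = M
    if t == s:  # s is rebound by this pro, so it has no free occurrence below
        return M
    return ("pro", t, subst(M1, N, s), subst(M2, N, s))


def step(M):
    """Contract a redex at the root of M; return None if M is not a redex."""
    if M == BOX or M[0] != "send":
        return None
    target, i = M[1], M[2]
    if target == BOX:
        return BOX  # □ <= i → □
    if target[0] == "pro":
        _, s, M1, M2 = target
        return subst((M1, M2)[i - 1], target, s)
    return None  # e.g. s <= i is stuck until an object is substituted for s


s = ("var", "s")
obj = ("pro", "s", s, BOX)          # pro s( s, □ )
first = step(("send", obj, 1))      # the first method returns the object itself
second = step(("send", obj, 2))     # the second method is the place-holder
stuck = step(("send", s, 1))        # s <= 1 cannot be evaluated
```

Only root redexes are contracted here; a full evaluator would also have to choose a strategy for redexes nested inside objects, which the text does not fix.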

We want to assign types to expressions of our language. The basic idea is that a type
assigned to pro s( M_1, M_2 ) should be essentially a product of types assigned to M_1 and
M_2. Thus, we would like to assert something like pro s( 3, 5 ) : (( int, int )), provided
we know that 3, 5 : int.

In general, the type of a pure object (an expression without sends) should correspond
to the shape of the object. If an object refers to a self s, the natural choice is to use a type
variable t, corresponding to the self s, and assert pro s( s, 5 ) : δt (( t, int )), where the
operator δt binds t within (( ... )). This can be extended to more complex pure objects,
e.g., pro s( pro s'( s', 5 ), pro s'( s, s' ) ) is of type δt (( δt'(( t', int )), δt'(( t, t' )) )).
We can simply identify a pure object with its own type. Indeed, the only difference
between the term and the type in the last example is where the constant 5 occurs. All
the rest is just syntactic sugar. The essential part, which is the structure of the self-
references, is the same, up to renaming, in the term and in the type, and can be drawn
as the tree in Fig. 1. This justifies our definition of a type as a term not containing <=.
An assignment of such a type τ to an expression M containing occurrences of <= means:
M is as good as a pure object of type τ. Moreover, as said above, we do not really care
about the place-holder and its types, because the place-holder represents an irrelevant
subexpression with respect to typing, therefore we assume □ : □ for the place-holder □.

Fig. 1. The structure of a term w.r.t. selves

Our type assignment should enjoy the subject reduction property, i.e., we want M' : τ,
whenever M → M' and M : τ. This requirement determines what the type assignment
rules should be. First of all, observe that M : pro s( τ_1, τ_2 ) should imply that M <= i
is of type τ_i[s := pro s( τ_1, τ_2 )]. It is less obvious which type should be assigned to
a send of the form s <= i. Clearly, our identification of an object and its type requires
a uniform principle s : s. That is, self is of type self. The type of s <= i should
depend on the context in which the expression occurs. Consider as an example the term
M = pro s( pro t( □, s <= 1 ), □ ), depicted as the (A) tree in Fig. 2. It may be tempting
to assert s <= 1 : t, because s <= 1 certainly points to the root of the object identified
by the self t. This amounts to understanding an object type pro t( ... ) as a recursive
type μt( ... ), that may freely be replaced by ( ... )[μt( ... )/t]. That, however, would
be wrong: consider the expression M <= 1. We have: M <= 1 → pro t( □, M <= 1 ) →
pro t( □, pro t( □, M <= 1 ) ) → ... From this reduction sequence we can see that no
finite object type can be assigned to M, as the expression develops into an infinite tree.
Thus, M should not be typed at all.

Note that the idea of a recursive type μt( □, t ) is not adequate here, which can be best
seen if we modify M to M' = pro s( pro t( t, s <= 1 ), □ ). While M' expands to an
infinite tree in reduction, it is not a full binary tree! Another reason why we do not want
to use recursive types is that we want to distinguish between pro s( 4, pro s( 2, s ) )
and pro s( 2, s ).

The problem we encountered in the above example does not occur, if we consider
the term N = pro s( pro t( □, s <= 1 <= 1 ), □ ). The picture is now represented as the
(B) tree in Fig. 2, and the type of s <= 1 <= 1 in this context should undoubtedly be □.

Fig. 2. Terms as trees. (A): leaf labels □ and s <= 1. (B): leaf labels □ and s <= 1 <= 1.

So what is the type of s <= 1? Now we see it must be pro t( □, □ ). But how can we
derive it? For this we need to know the type of N from an environment that assigns to s
the type of the object that s points to. (It is not a type of s as we always have s : s.) We
arrive at the following rule



    t : pro t( τ_1, τ_2 ) ⊢ M : t
    ------------------------------------
    t : pro t( τ_1, τ_2 ) ⊢ M <= i : τ_i

Thus to derive the type for N we must first guess it, put it into an environment in which
we derive types of the components of N, and finally we apply the following rule for
typing objects



    s : pro s( τ_1, τ_2 ) ⊢ N_1 : τ_1     s : pro s( τ_1, τ_2 ) ⊢ N_2 : τ_2
    -----------------------------------------------------------------------
    ⊢ pro s( N_1, N_2 ) : pro s( τ_1, τ_2 )

to eliminate the initial guess from the environment. The need of guessing the final type 
of a complex expression, before type-checking begins, makes it difficult to apply any 
structural approach to type inference. The problem becomes even more involved in 
the presence of an interaction between “external” sends to an object expression and 
“internal” sends occurring within that expression. 

2.1 The Roadmap of Notions 

We conclude this informal introduction with a brief description of several syntactic 
categories used in the course of the proof of the main decidability result. We use the 
following subsets of the set of all terms, ordered as follows: 

types ⊂ quasi types ⊂ stripped terms ⊂ terms.

Stripped terms are terms in which all applications of the send operator <= are ‘stripped
down’ to leaves, i.e., <= occurs only in the context s <= Π, where s is a self and
Π ∈ {1, 2}* is a non-empty path. Moreover, if an occurrence of s <= Π is bound,
then the binding pro is the outermost pro of the term. The main technical part of
the algorithm which decides typability is concerned with stripped terms. The strategy
of the algorithm consists in rewriting a given stripped term, trying to eliminate bound
occurrences of s <= Π. In this way we arrive at the next syntactic category of terms:
quasi types. A stripped term without bound occurrences of s <= Π (with Π ≠ ε) is
called a quasi type. Hence a quasi type is a term in which all applications of the send
operator are ‘stripped down’ to the leaves and every such occurrence is free, i.e., no
pro binds a self s which is in the context s <= Π, with Π ≠ ε. Quasi types behave in
several respects similarly to types: a quasi type is always typable and moreover its type is
uniquely determined by the environment. Finally the smallest syntactic category, types,
consists of terms in which no send operator occurs.

Since we are interested in a form of principal typing, we have to allow metavariables 
which range over types. In this way we obtain a class of meta schemes — these are just 
like ordinary terms, except that they may contain metavariables. Again, meta schemes 
are stratified syntactically in a similar way to that described above. Thus we have: 
quasi type schemes ⊂ stripped schemes ⊂ meta schemes.

Quasi type schemes are produced by the algorithm of the paper for each typable term, 
and only for such terms (see Theorem 3), which is the main result of this paper. 

3 Technical Background 

If Π ∈ {1, 2}+, then we define M <= Π by induction: M <= Πi := M <= Π <= i.
Occasionally, we use the notation M <= Π, even if Π can be empty, identifying M <= ε
with M. We call every send of the form s <= Π, where Π ≠ ε, an atomic send. A top
send in an object M = pro s.( M_1, M_2 ) is an atomic send s <= Π bound by the top
pro s in M. The length of Π is the length of the send s <= Π. We say that an atomic
send s <= Π is free in M if s ∈ FV(M).

If a term does not contain non-atomic sends, it is often convenient to think of it as 
a labelled binary tree. Internal nodes are labelled by selves and leaves are labelled by 
the place-holder, selves or sends. Nodes are identified with paths leading to them. For 
a string Γ ∈ {1, 2}* and a term M, if Γ leads in M to a node we will say that Γ is
contained in M and write Γ ∈ M. For Γ ∈ M we can also refer to a label of Γ meaning
the label of the node to which Γ leads in M.

A type is a term not containing <=. In particular an object type is a type which is also 
an object. A quasi type is a term in which all sends are atomic and free. A stripped term 
is a term of the form pro s.( M_1, M_2 ), where M_1 and M_2 are quasi types. Thus in a
stripped term all bound sends are top sends. 
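
The three syntactic categories can be told apart mechanically. Below is a minimal sketch, assuming a hypothetical tuple encoding of terms; the function names are illustrative. One reading is assumed here: so that the inclusion quasi types ⊂ stripped terms holds, is_stripped also accepts quasi types that are not objects.

```python
BOX = "□"  # place-holder; selves are ("var", s), objects ("pro", s, M1, M2), sends ("send", M, i)


def is_type(M):
    """A type is a term that contains no send at all."""
    if M == BOX or M[0] == "var":
        return True
    if M[0] == "send":
        return False
    return is_type(M[2]) and is_type(M[3])


def is_quasi_type(M, bound=frozenset()):
    """Every send must be atomic (headed by a self) and that self must be free."""
    if M == BOX or M[0] == "var":
        return True
    if M[0] == "send":
        head = M
        while head != BOX and head[0] == "send":
            head = head[1]  # strip the indices to reach the head of the send
        return head != BOX and head[0] == "var" and head[1] not in bound
    _, s, M1, M2 = M
    return is_quasi_type(M1, bound | {s}) and is_quasi_type(M2, bound | {s})


def is_stripped(M):
    """An object whose two components are quasi types (leaves are accepted too)."""
    if M != BOX and M[0] == "pro":
        return is_quasi_type(M[2]) and is_quasi_type(M[3])
    return is_quasi_type(M)


s, t, u = ("var", "s"), ("var", "t"), ("var", "u")
a_type = ("pro", "s", s, BOX)                      # pro s( s, □ )
a_quasi_type = ("pro", "s", s, ("send", u, 1))     # u <= 1 is a free send
a_stripped = ("pro", "s", ("send", s, 1), BOX)     # s <= 1 is a top send
not_stripped = ("pro", "s", ("pro", "t", BOX, ("send", t, 1)), BOX)  # t <= 1 bound below the top
```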

A self declaration is a pair of the form s : τ, where τ is an object type. An environment
E is a sequence of self declarations, such that no declaration in E involves a (free)
variable declared later on in E. More precisely, the definition of an environment, its
domain Dom(E), and the set FV(E) of free selves of E is stated inductively as follows.

- The empty sequence ∅ is an environment, and Dom(∅) = ∅ = FV(∅).

- If E is an environment, s is a self such that s ∉ FV(E), and τ is a type, then
E' = E, s : τ is an environment, with Dom(E') = Dom(E) ∪ {s} and FV(E') =
FV(E) ∪ FV(τ).

We will use the convention that if s : τ is a declaration, then τ is of the form τ =
pro s.( τ_1, τ_2 ). For s ∈ Dom(E), we write E(s) = τ if τ is the type which is assigned
to s by the rightmost declaration for s in E.

3.1 Formal Field Selection 

Given a quasi type T and Π ∈ {1, 2}*, we define a quasi type T.Π, called a formal field
selection:

- T.ε = T,

- □.Π = □,

- (s <= Γ).Π = s <= ΓΠ, for Γ ∈ {1, 2}*, in particular s.Π = s <= Π,

- if T = pro s.( T_1, T_2 ), then T.iΠ = (T_i[T/s]).Π.

Let us stress that, by the above definition, the notations s.Π and s <= Π are interchange-
able.

In the last clause of the above definition the substitution T_i[T/s] is just the ordinary
substitution of T for all free occurrences of s in T_i. Notice that in this case no free
occurrence of s in T_i is a free send. Otherwise, the result of the substitution is not
necessarily a quasi type. That is, quasi types are not closed with respect to ordinary
substitutions. The general case of substitution of quasi types is dealt with in the full
version of the paper.
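
The four clauses of formal field selection translate directly into a recursive function. The following is a minimal sketch, assuming a hypothetical tuple encoding of quasi types and paths given as tuples over {1, 2}; the names subst and select are illustrative, and substitution assumes that no capture can occur.

```python
BOX = "□"  # place-holder; selves are ("var", s), objects ("pro", s, T1, T2), sends ("send", T, i)


def subst(M, N, s):
    """M[N/s]; assumes no capture can occur (alpha conversion is assumed in the text)."""
    if M == BOX:
        return BOX
    if M[0] == "var":
        return N if M[1] == s else M
    if M[0] == "send":
        return ("send", subst(M[1], N, s), M[2])
    _, t, M1, M2 = M
    return M if t == s else ("pro", t, subst(M1, N, s), subst(M2, N, s))


def select(T, path):
    """Formal field selection T.Π for a quasi type T and a tuple Π over {1, 2}."""
    if not path:
        return T                                  # T.ε = T
    if T == BOX:
        return BOX                                # □.Π = □
    if T[0] in ("var", "send"):                   # (s <= Γ).Π = s <= ΓΠ
        for i in path:
            T = ("send", T, i)
        return T
    _, s, T1, T2 = T                              # T.iΠ = (T_i[T/s]).Π
    return select(subst((T1, T2)[path[0] - 1], T, s), path[1:])


t, x, y, z = (("var", n) for n in "txyz")
tau = ("pro", "t", ("pro", "x", y, z), t)         # pro t( pro x( y, z ), t )
```

With these definitions, selecting the second field of tau unfolds the self t and yields tau again, while the path 12 reaches the self z.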



3.2 Evaluation of Stripped Terms in an Environment 

Given a stripped term M, we define a stripped term (M)_E, called the value of M in the
environment E, as follows.

- (s)_E = s,

- (□)_E = □,

- (s <= iΠ)_E = (τ_i.Π)_E, whenever E(s) = pro s.( τ_1, τ_2 ),

- (s <= iΠ)_E = s <= iΠ, if s ∉ Dom(E),

- (pro s.( M_1, M_2 ))_E = pro s( (M_1)_E, (M_2)_E ), where s ∉ FV(E) ∪ Dom(E).

Note that the above definition is correct, i.e., that the induction is well-founded.

Lemma 1. Let T be a quasi type and let Π ∈ {1, 2}*. Then for every environment E,
we have ((T)_E.Π)_E = (T.Π)_E. In particular, ((s <= Γ)_E.Π)_E = (s <= ΓΠ)_E.
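
Evaluation in an environment can be sketched on top of formal field selection. The code below is an illustration under assumptions, not the paper's algorithm: terms use a hypothetical tuple encoding, an environment is a Python list of (self, object type) pairs read left to right, and bound selves are assumed to be fresh for the environment. The last lines check one instance of Lemma 1.

```python
BOX = "□"  # hypothetical encoding: ("var", s), ("pro", s, M1, M2), ("send", M, i)


def subst(M, N, s):
    """M[N/s]; assumes no capture can occur."""
    if M == BOX:
        return BOX
    if M[0] == "var":
        return N if M[1] == s else M
    if M[0] == "send":
        return ("send", subst(M[1], N, s), M[2])
    _, t, M1, M2 = M
    return M if t == s else ("pro", t, subst(M1, N, s), subst(M2, N, s))


def select(T, path):
    """Formal field selection T.Π (see Section 3.1)."""
    if not path:
        return T
    if T == BOX:
        return BOX
    if T[0] in ("var", "send"):
        for i in path:
            T = ("send", T, i)
        return T
    _, s, T1, T2 = T
    return select(subst((T1, T2)[path[0] - 1], T, s), path[1:])


def value(M, E):
    """(M)_E for a stripped term M; E is a list of (self, object type) pairs."""
    if M == BOX or M[0] == "var":
        return M                                   # (□)_E = □ and (s)_E = s
    if M[0] == "send":
        head, path = M, []
        while head[0] == "send":                   # atomic send: the head is a self
            path.insert(0, head[2])
            head = head[1]
        decls = [tau for (name, tau) in E if name == head[1]]
        if not decls:
            return M                               # s is not in Dom(E)
        _, _, tau1, tau2 = decls[-1]               # rightmost declaration for s
        return value(select((tau1, tau2)[path[0] - 1], tuple(path[1:])), E)
    _, s, M1, M2 = M                               # assumes s is fresh for E
    return ("pro", s, value(M1, E), value(M2, E))


t, u, x, y, z = (("var", n) for n in "tuxyz")
E = [("x", ("pro", "x", y, z)), ("t", ("pro", "t", x, t))]
T = ("send", t, 1)                                 # the quasi type t <= 1
lhs = value(select(value(T, E), (2,)), E)          # ((T)_E.Π)_E with Π = 2
rhs = value(select(T, (2,)), E)                    # (T.Π)_E
```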



4 Type Assignment 



A type judgement takes the form E ⊢ M : τ, where E is a type environment, M is
a term and τ is a type. The rules are listed below. In (obj) we use the abbreviations
τ = pro s( τ_1, τ_2 ) and M = pro s( M_1, M_2 ).



(const)   E ⊢ □ : □

(var)     E ⊢ s : s

          E, s : τ ⊢ M_1 : τ_1     E, s : τ ⊢ M_2 : τ_2
(obj)     ---------------------------------------------
                           E ⊢ M : τ

          E ⊢ M : τ
(send)    --------------------     (if (τ.i)_E is a type)
          E ⊢ M <= i : (τ.i)_E

First of all, observe that the understanding of E ⊢ M : τ is nonstandard. The environment
E does not provide types of free variables, as usual, but only “type bindings” used
exclusively for typing sends. The type assigned to a free variable is always the variable
itself. In particular, one does not need to assume free variables of M to be in the domain
of E, provided that there is no (direct or indirect) send involving these variables. For
instance we have ⊢ s : s, but to type pro s( s, t ) <= 21 we need a type binding for t.
Furthermore, notice that the type of the place-holder □ is the place-holder itself, and
this reflects our idea that the place-holder stands for ignored sub-expressions. Below we
illustrate the features of the system with some examples.

Example 1. Not every term is typable. Consider the following stripped term M =
pro s( pro t( s <= 1, □ ), □ ); we show that M is indeed untypable³. Assume that
⊢ M : τ, for some type τ. It follows that τ must be of the form τ = pro s.( τ_1, τ_2 ) and
that we must have a derivation of s : τ ⊢ pro t( s <= 1, □ ) : τ_1. Now, again τ_1 must be of
the form τ_1 = pro t( τ_11, τ_12 ) and we must have a derivation s : τ, t : τ_1 ⊢ s <= 1 : τ_11.
Thus τ_11 = (τ_1)_E = τ_1, where E = {s : τ, t : τ_1}. This yields a contradiction.

Observe that the type of a term is not uniquely determined by the term and the environ-
ment (see Example 2). However, it can be shown that the resulting type of a quasi
type is uniquely determined by the environment. 

Example 2. Consider now a stripped term M = pro s( s <= 12, s <= 112 ). The reader
will easily check that the following typings are derivable in the system.

⊢ M : pro s( □, □ )                                          (1)

⊢ M : pro s( pro t( pro x( y, z ), t ), z )                  (2)

⊢ M : pro s( pro t( pro x( y, s ), t ), s )                  (3)

x : pro x( y, z ) ⊢ M : pro s( pro t( x, t ), z )            (4)

t : pro t( pro x( y, z ), t ) ⊢ M : pro s( t, z )            (5)

x : pro x( y, z ), t : pro t( x, t ) ⊢ M : pro s( t, z )     (6)



Types assigned to M in (1) and (2) are clearly of completely different nature. Also
the types in (2) and (3) are different due to the different structure of the bindings.
Environments in (4)-(6) are used to type atomic sends of M.
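
Typings of this kind can be checked mechanically once a candidate type is given. The following is a minimal sketch under several assumptions that are not part of the paper: terms use a hypothetical tuple encoding, types are compared syntactically (the candidate type must use the same bound selves as the term), bound selves are fresh for the environment, and sends applied directly to an object expression are not handled, because rule (obj) needs a guessed type. The checker only verifies a proposed typing and does not infer one.

```python
BOX = "□"  # hypothetical encoding: ("var", s), ("pro", s, M1, M2), ("send", M, i)

def subst(M, N, s):
    if M == BOX:
        return BOX
    if M[0] == "var":
        return N if M[1] == s else M
    if M[0] == "send":
        return ("send", subst(M[1], N, s), M[2])
    _, t, M1, M2 = M
    return M if t == s else ("pro", t, subst(M1, N, s), subst(M2, N, s))

def select(T, path):                      # formal field selection T.Π
    if not path:
        return T
    if T == BOX:
        return BOX
    if T[0] in ("var", "send"):
        for i in path:
            T = ("send", T, i)
        return T
    _, s, T1, T2 = T
    return select(subst((T1, T2)[path[0] - 1], T, s), path[1:])

def value(M, E):                          # (M)_E, E a list of (self, type) pairs
    if M == BOX or M[0] == "var":
        return M
    if M[0] == "send":
        head, path = M, []
        while head[0] == "send":
            path.insert(0, head[2])
            head = head[1]
        decls = [tau for (name, tau) in E if name == head[1]]
        if not decls:
            return M
        _, _, tau1, tau2 = decls[-1]
        return value(select((tau1, tau2)[path[0] - 1], tuple(path[1:])), E)
    return ("pro", M[1], value(M[2], E), value(M[3], E))

def is_type(M):
    if M == BOX or M[0] == "var":
        return True
    return M[0] == "pro" and is_type(M[2]) and is_type(M[3])

def infer(E, M):
    """Type of M by (const), (var) and (send); None if there is none."""
    if M == BOX or M[0] == "var":
        return M
    if M[0] == "send":
        tau = infer(E, M[1])
        if tau is None:
            return None
        result = value(select(tau, (M[2],)), E)
        return result if is_type(result) else None
    return None                           # objects need a guessed type: use check

def check(E, M, tau):
    """Decide E ⊢ M : τ for a proposed type τ, via rule (obj) at objects."""
    if M != BOX and M[0] == "pro":
        if tau == BOX or tau[0] != "pro" or tau[1] != M[1]:
            return False
        E2 = E + [(M[1], tau)]
        return check(E2, M[2], tau[2]) and check(E2, M[3], tau[3])
    return infer(E, M) == tau

def send(M, *path):
    for i in path:
        M = ("send", M, i)
    return M

s, t, x, y, z = (("var", n) for n in "stxyz")
M = ("pro", "s", send(s, 1, 2), send(s, 1, 1, 2))   # pro s( s <= 12, s <= 112 )
px = ("pro", "x", y, z)
typing1 = check([], M, ("pro", "s", BOX, BOX))
typing2 = check([], M, ("pro", "s", ("pro", "t", px, t), z))
typing4 = check([("x", px)], M, ("pro", "s", ("pro", "t", x, t), z))
typing5 = check([("t", ("pro", "t", px, t))], M, ("pro", "s", t, z))
unbound = check([], M, ("pro", "s", t, z))          # typing (5) without its binding for t
```

In this sketch the last call fails because the send t <= 2 cannot be evaluated to a type without a type binding for t, which matches the role of the environments in (4)-(6).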

We remark in passing that the above type assignment system has the subject reduction
property (for details see the full version of the paper).

The reader familiar with [4] will notice that our type bindings are directly inspired 
by the idea of “matching types”. A direct comparison between the present system and C
of [3] is possible: our syntax of terms is different from that of C, but if we forget about the
syntax of terms, a closer look reveals that our rule (obj) corresponds to (two applications
of) rule (Val Method Addition) of C, and rule (send) is essentially the same as C's rule
(Val Select).

3 Compare this to the example of Fig. 2 (A) discussed in Section 2.

5 Meta Schemes



We introduce the meta schemes and their instantiations in order to state the principal
quasi type theorem. First we introduce a new category of variables, called metavariables.
For each path Δ ∈ {1, 2}* we have a countable supply of metavariables α^Δ (possibly
with subscripts, when necessary). Each α^Δ can be instantiated with a type which has to
satisfy a certain property to be stated later. Metavariables play the same role as selves,
except they cannot be bound by pro. In particular, the send operation is applicable to a
metavariable.

We start with meta schemes, T. They are built according to the following grammar

T ::= □ | s <= Π | α^Δ <= Π | pro s.( T, T ),

where Δ and Π range over {1, 2}*. We identify s <= ε with s and α^Δ <= ε with α^Δ.
Expressions of the form α^Δ <= Π, where Π ≠ ε, will be called meta sends. Let TV(T)
denote the set of all metavariables which occur in T and let FS(T) denote the set of all
sends s <= Π which occur free in T (i.e., s is free in T).

A meta scheme in which all sends are free is called a quasi type scheme. Observe that
a quasi type scheme without metavariables is a quasi type. A stripped scheme is a meta
scheme in which bindings of sends occur only at the top, i.e., T is a stripped scheme if
it is of the form: □, s <= Π, α^Δ <= Π, or pro s.( T_1, T_2 ), where T_1 and T_2 are quasi
type schemes. So, a stripped scheme without metavariables is a stripped term.

Most of the definitions which are applicable to terms are also applicable to meta
schemes. For example, the definition of formal field selection can be extended to quasi
type schemes by adding the clause for metavariables: (α^Δ <= Γ).Π = α^Δ <= ΓΠ.

An instantiation of a meta scheme T is a pair (E, S), where E is an environment
and S is a substitution which assigns to every metavariable α^Δ ∈ TV(T) a type ρ such
that (ρ.Δ)_E = ρ.

For a substitution S which assigns types to metavariables in TV(T), by T{S} we
denote the term obtained by substituting types for metavariables in T. The definition of
T{S} is by straightforward induction, the only nontrivial clause being (α^Δ <= Π){S} =
S(α^Δ).Π. Of course, we perform α-conversion, when necessary, in order to avoid send
capture. Clearly when T is a quasi type scheme then T{S} is a quasi type; similarly for
stripped schemes.

For a stripped scheme T the value of T in an instantiation (E, S) for T is the
stripped term (T{S})_E.

Meta schemes T_1 and T_2 are said to be equivalent if:

- for every instantiation (E, S) of T_1 there is a substitution S' such that (E, S') is an
instantiation of T_2 and (T_1{S})_E = (T_2{S'})_E; and

- for every instantiation (E, S) of T_2 there is a substitution S' such that (E, S') is an
instantiation of T_1 and (T_1{S'})_E = (T_2{S})_E.

A meta scheme T is said to be typable if there is an instantiation (E, S) of T and
a type τ such that E ⊢ T{S} : τ is derivable.

6 The Rewrite System

The aim of this section is to give rewrite rules for transforming a given stripped scheme 
into a quasi type scheme. The transformation is going to be a partial function, i.e., 
for some stripped schemes there will be no corresponding quasi type scheme. We will 
describe two kinds of redexes: reducible and cyclic. First we need an auxiliary definition 
with which we can define the redexes. 

6.1 The Projection and Remainder Functions 

For a stripped scheme $T$ we define a pair of functions: a projection function $p_T : 
\{1,2\}^* \to T$ and a remainder function $r_T : \{1,2\}^* \to \{1,2\}^*$. Intuitively $p_T(\Pi)$ 
is a node of $T$ which is obtained by travelling in $T$ along $\Pi$, subject to the following 
conditions. If $\Pi$ is contained in $T$ then we terminate at $\Pi$. Otherwise we apply the 
following rules for passing through a leaf $\Gamma$: 

- if $\Gamma$ is labelled by a self $t$ which is bound at node $\Delta$, then the next step starts at node 
$\Delta$; 

- if $\Gamma$ is labelled by the place-holder, then we return to this node in the next step (and 
thus in all following steps); 

- if $\Gamma$ is labelled by a free send, a meta send, or a top send, then we terminate at this 
node, i.e., no next step is possible. 

Then $r_T(\Pi)$ is what remains of $\Pi$ upon the termination of the navigation through $T$. 
The formal definition now follows. 

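Case A: ($\Pi \in T$) 

$$p_T(\Pi) = \Pi \quad\text{and}\quad r_T(\Pi) = \varepsilon$$

Case B: ($\Pi_1\Pi_2 \in T$ is a leaf labelled $t$, $\Pi_1$ is labelled $t$, and $\Pi_2 \neq \varepsilon$ and $\Delta \neq \varepsilon$) 

$$p_T(\Pi_1\Pi_2\Delta) = p_T(\Pi_1\Delta) \quad\text{and}\quad r_T(\Pi_1\Pi_2\Delta) = r_T(\Pi_1\Delta)$$

Case C: ($\Delta \neq \varepsilon$ and $\Pi \in T$ is a leaf labelled by one of the following: a free send, 
a top send, a meta send) 

$$p_T(\Pi\Delta) = \Pi \quad\text{and}\quad r_T(\Pi\Delta) = \Delta$$

Case D: ($\Delta \neq \varepsilon$ and $\Pi \in T$ is a leaf labelled by the place-holder) 

$$p_T(\Pi\Delta) = \Pi \quad\text{and}\quad r_T(\Pi\Delta) = \varepsilon$$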
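
The four cases can be read as a small tree walk. The following Python sketch is an illustration only: the tuple encoding of schemes (`("pro", binder, left, right)`, `("self", name)`, `("hole",)`, `("send", label)`) and all function names are assumptions of the sketch, not part of the paper. Free, top and meta sends share one leaf kind because Case C treats them alike.

```python
def node_at(tree, addr):
    """Return the node at address addr (a string over '1'/'2'), or None."""
    node = tree
    for step in addr:
        if node[0] != "pro":
            return None
        node = node[2] if step == "1" else node[3]
    return node


def binder_address(tree, addr, name):
    """Address of the innermost enclosing 'pro' that binds the self `name`."""
    for cut in range(len(addr) - 1, -1, -1):
        node = node_at(tree, addr[:cut])
        if node[0] == "pro" and node[1] == name:
            return addr[:cut]
    return None


def project(tree, path):
    """Return the pair (projection, remainder) for `path` in `tree`."""
    here, rest = "", path
    while rest:
        node = node_at(tree, here)
        if node[0] == "pro":          # internal node: keep travelling
            here, rest = here + rest[0], rest[1:]
        elif node[0] == "self":       # Case B: restart at the binder
            bound = binder_address(tree, here, node[1])
            if bound is None:         # unbound self: stop here (assumption)
                return here, rest
            here = bound
        elif node[0] == "hole":       # Case D: stuck at the place-holder
            return here, ""
        else:                         # Case C: any kind of send
            return here, rest
    return here, ""                   # Case A: the path lies inside the tree
```

On the scheme of Example 5 below, `project(m, "112")` returns `("1", "12")`, matching the values computed there by hand.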

6.2 Reducible Top Sends 

A top send $s \Leftarrow \Pi$ is said to be reducible if $p_T(\Pi)$ is not an occurrence of a top send. 

Among reducible top sends are those which we call inconsistent. A top send $s \Leftarrow \Pi$ 
is said to be inconsistent if $p_T(\Pi) = i\Delta$, for some $i$ and $\Delta$, and $s \Leftarrow \Pi$ occurs in $T_i.\Delta$. 
A reducible send which is not inconsistent is called consistent. 
Lemma 2. If a stripped scheme contains an inconsistent top send, then it is not typable. 

Let $s \Leftarrow \Pi$ be a reducible top send in $T$, and let $p_T(\Pi) = i\Delta$ and $r_T(\Pi) = \Sigma$. 
Reduction of $s \Leftarrow \Pi$ consists in replacing every occurrence of $s \Leftarrow \Pi$ in $T$ by $T_i.\Delta\Sigma$. 
It follows that $\Delta \in T_i$ and that we have the following two possibilities: 

1. $\Delta$ is an internal node of $T_i$. Then $\Sigma = \varepsilon$. 

2. $\Delta$ is a leaf in $T_i$. Then the label of this leaf is one of the following: 

2a. A free send in T ; 

2b. A meta send; 

2c. The place-holder. 

In each case ((1) or (2)) it follows that the subtree $T_i.\Delta\Sigma$ does not contain new top 
sends, i.e., there may be new occurrences of top sends after the reduction, but the set 
of all different top sends after the reduction is not larger than before. In fact, when 
the reducible top send is not inconsistent, the number of top sends after the reduction 
decreases by one. 

The intuitions behind the previous concepts are: (i) an inconsistent reducible top 
send addresses a subtree of the tree representing the term in question which contains 
the top send itself, meaning that the top send’s type should contain itself properly (see 
the first example in Section 2); (ii) a consistent reducible top send is one for which we 
can mimic the evaluation process, by substituting it with the subtree it addresses. This 
way we make a step towards a send-free term, which will correspond to the quasi type 
scheme. 



6.3 Cyclic Top Sends 

Let $S_T$ be the set of all occurrences of top sends in $T$. The projection and remainder 
functions give rise to two mappings $\hat{p}_T : S_T \to T$ and $\hat{r}_T : S_T \to \{1,2\}^*$. For any 
$\Gamma \in S_T$, if the label of $\Gamma$ is $s \Leftarrow \Pi$, then $\hat{p}_T(\Gamma) = p_T(\Pi)$ and $\hat{r}_T(\Gamma) = r_T(\Pi)$. A top 
send $s \Leftarrow \Pi$ is said to be cyclic if for one of its occurrences $\Gamma \in S_T$ we have $\hat{p}_T^{\,k}(\Gamma) = \Gamma$ 
for some $k \geq 1$. It follows that the occurrence $\Gamma$ is unique. We call it a cyclic occurrence 
of $s \Leftarrow \Pi$. The least $k$ satisfying $\hat{p}_T^{\,k}(\Gamma) = \Gamma$ will be called the period of $s \Leftarrow \Pi$. The 
word $\hat{r}_T(\hat{p}_T^{\,k-1}(\Gamma)) \cdots \hat{r}_T(\hat{p}_T(\Gamma))\,\hat{r}_T(\Gamma)$, where $\Gamma$ is the cyclic occurrence and $k$ is the 
period of $s \Leftarrow \Pi$, will be called the cyclic coefficient of the cyclic send $s \Leftarrow \Pi$. 

Let $s \Leftarrow \Pi$ be a cyclic top send in $T$ and let $\Delta$ be its cyclic coefficient. Reduction 
of $s \Leftarrow \Pi$ consists in replacing every occurrence of $s \Leftarrow \Pi$ in $T$ by $\alpha^{\Delta}$, where $\alpha^{\Delta}$ is 
a fresh metavariable not occurring in $T$. 

It follows that sends which label the nodes $\hat{p}_T(\Gamma), \ldots, \hat{p}_T^{\,k-1}(\Gamma)$ are also cyclic in $T$. 
After the reduction the send labelling the node $\hat{p}_T^{\,k-1}(\Gamma)$ becomes reducible, while the 
other sends are not subject to immediate reduction in the new scheme. 

The intuition behind a cyclic send is that it represents an infinite computation (infinite 
computations are universally accepted in object-oriented calculi; see typical examples in 
[1,11]). Essentially, a cyclic send refers to itself within a certain number of computation 
steps, which is the period. 
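
The period and the cyclic coefficient can be computed by iterating the projection from an occurrence until it returns to its starting point. The Python sketch below is a simplified illustration, not the paper's algorithm: it assumes a flat encoding in which a scheme is a dictionary from leaf addresses to labels (the path of a top send, or `None` for the place-holder), so it covers only schemes whose leaves are top sends or place-holders. The names `project` and `cycle_info` are hypothetical.

```python
def project(leaves, path):
    """(projection, remainder) of `path`; leaves maps address -> label."""
    for k in range(len(path) + 1):
        prefix = path[:k]
        if prefix in leaves:
            # Whole path consumed, or stuck at a place-holder: empty remainder.
            if k == len(path) or leaves[prefix] is None:
                return prefix, ""
            return prefix, path[k:]   # stopped at a top send
    return path, ""                   # the path ends at an internal node


def cycle_info(leaves, start):
    """Return (period, cyclic coefficient) if the top send occurring at
    address `start` is cyclic through that occurrence, otherwise None."""
    addr, remainders = start, []
    for k in range(1, len(leaves) + 1):
        label = leaves.get(addr)
        if label is None:             # not an occurrence of a top send
            return None
        addr, rem = project(leaves, label)
        remainders.append(rem)
        if addr == start:
            # Remainders are concatenated last-to-first, as in the definition.
            return k, "".join(reversed(remainders))
    return None
```

With the leaves of Example 6 below, `cycle_info(m6, "12")` gives period 2 and coefficient `"21"`, while the occurrence at address `"21"` is reported as not cyclic.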
6.4 Confluence and Termination 

The main properties of the above rewrite system are collected in the next two results. 

Theorem 1. Let T be a stripped scheme. 

1. (Termination) Let n be the number of top sends in T. After n steps of reduction 
we either arrive at a quasi type scheme, or else we must have earlier detected an 
inconsistent reducible send. 

2. (Confluence) Let $T'$ and $T''$ be two quasi type schemes obtained from $T$ by a 
sequence of reductions. Then $T'$ and $T''$ are equivalent. 



Theorem 2. Let T be a stripped scheme. The following are equivalent. 

1. T is typable. 

2. There exists a sequence of reductions which transforms T into a quasi type scheme. 

3. Every sequence of reductions transforms T into a quasi type scheme. 

Moreover, if $T^{\#}$ is a quasi type scheme obtained from $T$ by a sequence of reductions 
and $(E, S)$ is any instantiation of $T^{\#}$ such that $(T^{\#}\{S\})_E$ is a type, say $\tau$, then the 
judgement $E \vdash T\{S\} : \tau$ is derivable. 

It follows from Theorem 1 that every stripped scheme has at most one normal form 
up to scheme equivalence. 

Example 3. $M$ = pro s( pro t( □, s ⇐ 1 ), □ ): $M$ is a stripped term, and the top 
send s ⇐ 1 is reducible and inconsistent, because $p_M(1) = 1$ and s ⇐ 1 occurs in 
the sub-tree $M_1$ (following the definition of Section 6.2). Thus $M$ is not typable by 
Lemma 2. 

Example 4. $M$ = pro s( s ⇐ 1, □ ): $M$ is a stripped term, and the top send s ⇐ 1 
is cyclic. Let $\Gamma = 1$ be the "address" of the top send s ⇐ 1 (this is correct since such 
top send is the "1-branch" of the "tree" $M$) and $\Pi = 1$ (being s ⇐ 1). Following the 
definitions of Section 6.3, we calculate $\hat{p}_M(1) = p_M(1) = 1$ (i.e., the period is 1) and 
$r_M(\Pi = 1) = \varepsilon$ (i.e., the cyclic coefficient is empty). Then we get pro s( α, □ ). 

By assigning to α a type $\rho$ such that $\rho = \rho$ (the top send is cyclic, and the type assigned 
to α takes this into account), we have that this term is typable with any type for the first 
component, the simplest being pro s( □, □ ) and pro s( s, □ ). 

Example 5. $M$ = pro s( s ⇐ 12, s ⇐ 112 ): $M$ is a stripped term, and the top send 
s ⇐ 12 is cyclic. We calculate, for $\Gamma = 1$, $\hat{p}_M(1) = p_M(12) = 1$ (i.e., the period is 
1), and $\hat{r}_M(1) = r_M(12) = 2$ (i.e., the cyclic coefficient is 2), therefore we obtain the 
quasi type scheme pro s( α², s ⇐ 112 ). Now s ⇐ 112 becomes reducible and we get 
pro s( α², α² ⇐ 12 ) by calculating $p_M(112) = 1$ and $r_M(112) = 12$ (following the 
definitions of Section 6.2), and then by substituting s ⇐ 112 with $M_1.12$ = α² ⇐ 12. 
By assigning to α² a type $\rho$ such that $(\rho.2) = \rho$, we can get $M$ : pro s( □, □ ), 
$M$ : pro s( pro t( pro x( y, z ), t ), z ), etc. 
Example 6. $M$ = pro s( pro t( s ⇐ 121, s ⇐ 221 ), pro t( s ⇐ 1, s ⇐ 122 ) ): $M$ is 
a stripped term. The top sends s ⇐ 221 and s ⇐ 122 form a cycle: we can solve the 
cycle by starting from the former or from the latter. For example, we start from s ⇐ 221. 
We calculate $\hat{p}_M(12) = p_M(221) = 22$ and $\hat{p}_M^{\,2}(12) = p_M(122) = 12$ (22 being the 
"address" of s ⇐ 122), so the period is 2. We also calculate $\hat{r}_M(12) = r_M(221) = 1$ 
and $\hat{r}_M(\hat{p}_M(12)) = r_M(122) = 2$, so the cyclic coefficient is 21. Then $M$ rewrites into 

$M'$ = pro s( pro t( s ⇐ 121, α²¹ ), pro t( s ⇐ 1, s ⇐ 122 ) ). 

Now we consider the top send s ⇐ 122, which becomes reducible (with $p_{M'}(122) = 12$ 
and $r_{M'}(122) = 2$), and we substitute it with $M'_1.22$, that is, $M'$ rewrites into 
$M''$ = pro s( pro t( s ⇐ 121, α²¹ ), pro t( s ⇐ 1, α²¹ ⇐ 2 ) ). With two more 
rewriting steps, one for s ⇐ 1 and one for s ⇐ 121, both reducible, we rewrite $M''$ into 
$M'''$ = pro s( pro t( s ⇐ 121, α²¹ ), pro t( pro t( s ⇐ 121, α²¹ ), α²¹ ⇐ 2 ) ) and 
$M^{iv}$ = pro s( pro t( α²¹ ⇐ 1, α²¹ ), pro t( pro t( α²¹ ⇐ 1, α²¹ ), α²¹ ⇐ 2 ) ), 
proving that $M$ is typable. 

If we start from s ⇐ 122 to solve the cycle, we would substitute α¹² at the 
"address" 22 as the first "rewrite" step, whereas following the above solution we 
substituted α²¹ at the "address" 12. 

7 Main Result 

We define a partial map which assigns to a term $M$ a quasi type scheme $T_M$, called 
a principal quasi type scheme of $M$. The partial map is defined by induction on $M$. 

- $T_c = c$ 

- $T_s = s$ 

- $T_{\mathrm{pro}\ s.(M_1, M_2)} = (\mathrm{pro}\ s(T_{M_1}, T_{M_2}))^{\#}$ 

- $T_{M \Leftarrow i} = T_M.i$ 

The above recurrence equations must be understood in such a way that the left hand side 
is defined if and only if the right hand side is defined. Note that the above definition 
is correct, i.e., that the induction is well-founded. The intuition is that, by induction, 
$T_{M_1}$, $T_{M_2}$ are quasi type schemes, therefore $\mathrm{pro}\ s(T_{M_1}, T_{M_2})$ is a stripped 
scheme, to which $(\cdot)^{\#}$ can be applied. 

The main result of this paper is the following theorem. 

Theorem 3. (Principal quasi type theorem) 

1. $M$ is typable iff $T_M$ is defined. 

2. If $T_M$ is defined, then for every instantiation $(E, S)$ of $T_M$ such that $(T_M\{S\})_E$ is 
a type we have $E \vdash M : (T_M\{S\})_E$. 

3. The partial mapping $M \mapsto T_M$ is computable. Therefore the problem of typability 
is decidable. 



8 Extensions 

We have solved the type reconstruction problem for a system containing only the send 
operator to highlight the essential mathematical content of the problem itself. However, 
the approach can be extended to deal with the method addition operator i — b of the C 
calculus [3], and with message-not-understood run-time errors, without changing the 
mathematical core of our solution. 

Method addition. We must consider objects with an indefinite number of components. 
Then, since method addition is permitted only on proper objects, it is enough to extend 
the notion of “principal quasi type scheme” 7m, in order to check that the (quasi) type of 
the object receiving the addition is a pro (...) (quasi) type and that it does not contain 
the method to be added, together with checking that the method body is typable. 
Message-not-understood. By allowing objects with more than two components and 
method addition, the send operator becomes general: method invocation is allowed on 
proper objects (external send) and on selves (self-inflicted send). In the external case, 
the right-hand side of the equation $T_{M \Leftarrow i} = T_M.i$ of the principal quasi type scheme 
would be satisfied (and $M \Leftarrow i$ would be typable) if $T_M$ were a quasi type scheme of the 
form pro (...) containing an $i$ component, and the resulting quasi type scheme would 
be as in the two-method situation. A more difficult case is when the send is self-inflicted, 
i.e., if $M$ is a self $s$: this case must be solved directly during the global process of going 
from the stripped term containing $M \Leftarrow i$ to its quasi type scheme, because we need 
to check if the subtree rooted at $s$ has an $i$ branch. In order to do so, for every top send 
$s \Leftarrow \Pi$ we must check that the branching described by $\Pi$ exists in the subtree rooted 
at $s$. 

9 Conclusion and Future Work 



We have shown that decidable type reconstruction is possible for languages with nested 
selftype references. We believe that our result can be the core for typability of richer 
systems. 

Our result raises a number of further questions. Future work will include a detailed 
comparison, from the point of view of the typability, of our type system with the Palsberg- 
Jim system [16], restricted to our calculus, and with some of the Abadi-Cardelli systems 
[1] (in particular the first-order one, to begin with, and the ones with selftypes). 

Obviously, one wants to expand the analysis to the case of object languages with 
a more reasonable choice of operators. Adding method addition must be still formalized, 
but we conjecture that is nothing more than careful bookwork. Dealing with message- 
not-understood appears to be more delicate, because it implies an extension of the 
algorithm as hinted above. However, it does not change the techniques we use to detect 
and solve “loops”, which are the central part of our solution. Override is, instead, an open 
question at the moment. Thus far it can be only shown that adding method override makes 
the problem PTIME-hard. Intuitively, override, by substituting method bodies, may 
change the interrelationships among the cyclic top sends, inducing complex equational 
constraints — a very special case of second-order unification. It appears that self-inflicted 
overrides (i.e., overrides on selves inside method bodies) are the main issue. Nevertheless, 
also external overrides introduce some difficulties. To type a method override on an 
object, we would need to compare the (quasi) type of the overridden (old) method body 
with the (quasi) type of the overriding (new) one. Then the override is typable only if 
the two are “equal”, but we still do not have a complete notion of principality, therefore 
we are not able to decide on equality among (quasi) types 4 . 

Even for our simple language, there are still issues to be investigated. The naive 
algorithm, involving the construction of 7m, is obviously not feasible, as it involves 
nested substitutions. Although we believe the problem is solvable in polynomial time, a 
workable implementation is still to be developed, and does not seem to be trivial. 



Acknowledgement. The authors would like to thank the anonymous referees for their 
advice on how to improve the paper. 
Abstract. We consider the problem of efficient representation of de- 
pendently typed data. In particular, we consider a language TT based 
on Dybjer’s notion of inductive families [10] and reanalyse their gen- 
eral form with a view to optimising the storage associated with their 
use. We introduce an execution language, ExTT, which allows the com- 
menting out of computationally irrelevant subterms and show how to use 
properties of elimination rules to elide constructor arguments and tags 
in ExTT. We further show how some types can be collapsed entirely at 
run-time. Several examples are given, including a representation of the 
simply typed λ-calculus for which our analysis yields an 80% reduction 
in run-time storage requirements. 



1 Introduction 



Dependent type theory provides programmers with more than an integrated 
logic for reasoning about program correctness. It allows more precise types for 
programs and data in the first place, strengthening the typechecker’s language 
of guarantees. We have richer function types $\forall x : S.\ T$ which adapt their return 
types to each argument; we also have richer data structures which do not just 
contain but explain data, exposing and enforcing their properties. 

Moreover, we may reasonably expect more static detail about programs and 
data to yield better optimised dynamic behaviour. We need neither test what 
is guaranteed nor store what is determined by typechecking. Pollack’s implicit 
syntax [23] already supports the omission of much redundant information from 
concrete syntax for similar reasons. 

This paper identifies some space optimisations which significantly reduce 
the storage overheads associated with inductive families in the sense of [10]. 
These are data-indexed collections of mutually recursive datatypes, $D\ \vec{x}$, available 
in systems such as those underlying Lego [15], COQ [8], ALF [17] and also 
the language we use here, Epigram [19]. A common example for illustrative 
purposes is Vect, the family of list types indexed by element type and length: 



    data   A : ★    n : ℕ     where   ──────────────    a : A    v : Vect A k
           ──────────────            ε : Vect A 0      ───────────────────────
           Vect A n : ★                                a :: v : Vect A (s k)



S. Berardi, M. Coppo, and F. Damiani (Eds.): TYPES 2003, LNCS 3085, pp. 115—129, 2004. 
(c) Springer- Verlag Berlin Heidelberg 2004 
1.1 Programming with Inductive Families 

Function types over inductive families can use specific indices to require and 
ensure properties of inputs and outputs — e.g., compatibility of length: 



    let    u, v : Vect ℕ n
           ───────────────────
           vAdd u v : Vect ℕ n

    vAdd ε ε                ↦  ε
    vAdd (x :: u) (y :: v)  ↦  (x + y) :: vAdd u v



The precise type prevents some bogus choices of output: we can only return ε 
on the first line, only a :: on the second. The input possibilities become narrower 
too: adding (x :: u) to ε, or vice versa, is not even an issue. 

By the same token, the potential for optimisation is clear. Once we know 
whether the first argument is ε or (x :: u), we can presuppose the form of the 
second argument: we can ignore it in the ε case; in the :: case, we can safely 
project out y and v without checking the constructor tag. Moreover, if we inspect 
n, implicitly passed to vAdd, we need never check Vect constructor tags at all. 

We impose invariants on inductive families to improve reliability, but this 
paper seeks to exploit them for performance. Such optimisations are not available 
in conventional functional languages — there is no way that inspecting one 
argument can justify presuppositions about another. If we want to write vector 
addition using ordinary lists, we must not only consider how to handle length 
mismatch in our code, we must also effectively test for it at run-time. 
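
For contrast, the list-based version alluded to here might look as follows in a language without indexed families. This Python sketch is purely illustrative (the name `v_add` and the choice of raising `ValueError` are assumptions); its point is that the length test has to run on every call, whereas the Vect type rules the mismatch out statically.

```python
def v_add(u, v):
    """Element-wise addition of two plain lists of numbers."""
    # Nothing in the types relates the two lengths, so the mismatch case
    # must be handled in code and tested for at run-time.
    if len(u) != len(v):
        raise ValueError("length mismatch")
    return [x + y for x, y in zip(u, v)]
```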



1.2 Underlying Type Theory 

Following [19], Epigram programs elaborate to well typed terms in a type theory 
TT, based on Luo’s UTT [14] with inductive families [10] and equality as in [18]. 
Here is its syntax: 



    t ::= ★_i        (type of types)       | x      (variable)
        | ∀x:t. t    (function space)      | D      (inductive family)
        | λx:t. t    (abstraction)         | c      (constructor)
        | t t        (application)         | D-E    (elimination operator)



As usual, we may abbreviate the function space $\forall x : S.\ T$ by $S \to T$ if $x$ is not 
free in $T$. There is an infinite hierarchy of predicative universes. We 
leave universe levels to the machine, as in [11]. 

Computation is by β-reduction for λ-abstractions and ι-reduction for 
elimination operators. A data declaration typically elaborates to declarations of a 
family $D : \forall \vec{i} : \vec{I}.\ \star$, constructors $c$, and an elimination operator D-E equipped 
with ι-rules. We write $s \rightsquigarrow t$ if $s$ β- or ι-reduces to $t$. In the usual way, every 
well typed TT term $t$ computes to a weak head-normal form WHNF($t$). 

A typical constructor has a type like this: 1 



$$c : \forall \vec{a} : \vec{A}.\ D\ \vec{r}_1 \to \cdots \to D\ \vec{r}_j \to D\ \vec{s}$$

1 To ease presentation, we keep the non-recursive arguments a to the front and permit 
only first-order recursive arguments — neither restriction is crucial to this work. 
For example, elaborating our Vect example, we acquire Vect : ∀A : ★. ∀n : ℕ. ★, 
and constructors: 

    ε  : ∀A : ★. Vect A 0 
    :: : ∀A : ★. ∀k : ℕ. ∀a : A. ∀v : Vect A k. Vect A (s k) 

Note that the variables left schematic in the data declaration have become ex- 
plicitly quantified arguments. In naive implementations these take up space — 
every Vect A n stores the sequence 0, ..., n − 1, and n references to A. Even with 
perfect sharing, this is quite an overhead — the space implications for families 
with more complex invariants are quite drastic if this problem is left unchecked. 

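Basically, the elimination operator D-E has a type of this form: 

$$\begin{aligned}
&\forall \vec{i} : \vec{I}.\ \forall x : D\ \vec{i}. && \text{(indices, target)}\\
&\forall P : \forall \vec{i} : \vec{I}.\ D\ \vec{i} \to \star. && \text{(motive)}\\
&\forall m_c : \forall \vec{a} : \vec{A}.\ \forall y_1 : D\ \vec{r}_1.\ \ldots\ \forall y_j : D\ \vec{r}_j. &&\\
&\qquad P\ \vec{r}_1\ y_1 \to \cdots \to P\ \vec{r}_j\ y_j \to P\ \vec{s}\ (c\ \vec{a}\ \vec{y}). && \text{(methods)}\\
&P\ \vec{i}\ x
\end{aligned}$$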

The target, with given indices, explains what to eliminate; the motive ex- 
plains what is to be achieved by the elimination; the methods explain how to 
achieve the motive for each canonical form the target can take, given appropriate 
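inductive hypotheses. The associated ι-rules for definitional equality have this 
form: 

$$\Gamma \vdash \text{D-E}\ \vec{s}\ (c\ \vec{a}\ \vec{y})\ P\ \vec{m} = m_c\ \vec{a}\ \vec{y}\ (\text{D-E}\ \vec{r}_1\ y_1\ P\ \vec{m}) \ldots (\text{D-E}\ \vec{r}_j\ y_j\ P\ \vec{m})$$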

When indices are used uniformly, such as the element type of Vect, we adapt the 
basic D-E slightly, abstracting these parameters once for all. This yields: 

    Vect-E : ∀A : ★. ∀n : ℕ. ∀v : Vect A n. 
             ∀P : ∀n : ℕ. ∀v : Vect A n. ★. 
             ∀m_ε : P 0 (ε A). 
             ∀m_:: : ∀k : ℕ. ∀a : A. ∀v : Vect A k. (P k v) → P (s k) (:: A k a v). 
             P n v 

    Vect-E A 0 (ε A) P m_ε m_::             =  m_ε 
    Vect-E A (s k) (:: A k a v) P m_ε m_::  =  m_:: k a v (Vect-E A k v P m_ε m_::) 

Implementing Vect-E appears to require non-linear matching — there are 
repeated arguments in both patterns, suggesting a run-time conversion check. 
In fact this is not needed — the repeated arguments coincide in any well typed 
application of Vect-E. We do not need to recheck the duplicate A or k in the 
patterns (ε A) or (:: A k a v). So why store them? 

In this paper we show how to streamline the implementation of ι-rules so 
that unnecessary testing is avoided. We introduce extensions to the TT syntax 
for marking parts of terms to be ignored or removed. So equipped, we consider 
which constructor arguments can be ignored, and then play a similar game with 
constructor tags. Finally, we show how to eliminate some structures entirely and 
make a larger example smaller: the simply typed λ-calculus. 
1.3 Related Work 

Correctness preserving program transformations [9,22] provide a basis for many 
optimisations in simply typed functional languages. In this paper we use substi- 
tution transformations to mark unused terms for deletion in a similar manner 
to Berardi’s pruning of simply typed λ-terms [4]. Program transformation 
techniques have also been applied to type theory; Magaud and Bertot [16] show an 
approach to changing data representation by transforming the constructors and 
elimination rule of a family and use this technique to change from unary natural 
numbers to a more efficient binary representation. 

The COQ program extraction tool [21,13] attempts to remove purely logical 
parts of proofs in order to produce executable programs. Our approach differs 
in that we do not separate the predicative and impredicative type universes but 
attempt to remove all terms which are unused. 

Callaghan and Luo [7] use the well-typedness of elimination rules to avoid 
checking of repeated arguments, a technique which we apply and extend in this 
paper. Xi’s DML [25] also uses dependent types for optimisation, eliminating 
dead code [26] and array bounds checking [27]. 

2 Implementing Reduction Rules for Datatypes 

The elimination operator D-E is the only means TT provides for inspecting 
data in the inductive family D. If we optimise D-E’s reduction behaviour, we 
optimise the programs which elaborate in terms of it. Moreover, if any data in 
the representation of D’s elements is not needed by D-E, then it is never needed 
at run-time. Let us look more closely at how ι-rules are implemented. 

2.1 Pattern Syntax and Its Semantics 

We implement ι-rules D-E $\vec{t}_i = e_i$ by pattern matching, marking with [·] those 
parts of patterns $p$ which well typed terms are presupposed to match. Unmarking 
these parts gives back a term, $|p|$. 

$$\begin{aligned}
p ::=\ & x && \text{(pattern variable)} & \mid\ & [t] && \text{(presupposed term)}\\
\mid\ & c\ \vec{p} && \text{(constructor pattern)} & \mid\ & [c]\ \vec{p} && \text{(presupposed-constructor pattern)}
\end{aligned}$$

For each ι-rule, as above, we write an ι-scheme, D-E $\vec{p}_i \mapsto e_i$, 
with $|\vec{p}_i| = \vec{t}_i$ and $e_i$ a term over $\vec{p}_i$’s pattern variables. The ι-schemes are 
then compiled into an efficient case-expression [2]. However, our pattern syntax 
will facilitate the discussion without delving into those details. 

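The partial function MATCH tries to compute a matching substitution for 
a pattern and term (MATCHES lifts MATCH to sequences in the obvious way): 

$$\begin{aligned}
\mathrm{MATCH}(x,\ t) &\Longrightarrow (t/x)\\
\mathrm{MATCH}(c\ \vec{p},\ t) &\Longrightarrow \mathrm{MATCHES}(\vec{p},\ \vec{t}) \quad \text{if } \mathrm{WHNF}(t) \Longrightarrow c'\ \vec{t} \text{ and } c = c'\\
\mathrm{MATCH}([t'],\ t) &\Longrightarrow \mathrm{ID}\\
\mathrm{MATCH}([c]\ \vec{p},\ t) &\Longrightarrow \mathrm{MATCHES}(\vec{p},\ \vec{t}) \quad \text{if } \mathrm{WHNF}(t) \Longrightarrow c'\ \vec{t}\\
\mathrm{MATCHES}(\cdot,\ \cdot) &\Longrightarrow \mathrm{ID}\\
\mathrm{MATCHES}(p\ \vec{p},\ t\ \vec{t}) &\Longrightarrow \mathrm{MATCH}(p,\ t) \circ \mathrm{MATCHES}(\vec{p},\ \vec{t})
\end{aligned}$$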
The first two lines of MATCH test constructors and bind pattern variables as usual 
in implementations of pattern matching from [20] onwards. The remaining two 
lines, however, presuppose the successful outcome of testing. To justify these 
presuppositions, we shall require that each ι-scheme is Γ-respectful of well 
typed instances, i.e. 

if $\Gamma \vdash$ D-E $\vec{t} : T$ and MATCHES($\vec{p}_i$, $\vec{t}$) ⟹ σ then $\Gamma \vdash$ D-E $\sigma|\vec{p}_i|$ = D-E $\vec{t} : T$ 

A set of ι-schemes, D-E $\vec{p}_i \mapsto e_i$, is Γ-well-defined if, for any $\Gamma \vdash$ D-E $\vec{t} : T$ of 
the right arity, with a constructor-headed target, we have MATCHES($\vec{p}_i$, $\vec{t}$) ⟹ σ 
for exactly one $i$. This yields ι-reduction D-E $\vec{t} \mapsto \sigma e_i$. A set of ι-schemes 
which is Γ-respectful and Γ-well-defined for all Γ is said to implement the 
corresponding ι-rules. 
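
The behaviour of MATCH on presupposed patterns can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the paper's implementation: terms are assumed to be already in weak head-normal form, substitutions are dictionaries, and the tuple tags `"var"`, `"pre"`, `"con"` and `"precon"` are names invented for the sketch.

```python
def match(p, t):
    """Return a substitution (dict) matching pattern p against term t,
    or None on failure. Terms are ("con", name, args)."""
    kind = p[0]
    if kind == "var":                 # x: bind the pattern variable
        return {p[1]: t}
    if kind == "pre":                 # [t']: presupposed, nothing is tested
        return {}
    if kind == "con":                 # c ps: the constructor tag is tested
        if t[0] != "con" or t[1] != p[1]:
            return None
        return matches(p[2], t[2])
    if kind == "precon":              # [c] ps: the tag is NOT tested
        return matches(p[2], t[2])
    raise ValueError("unknown pattern kind: %r" % (kind,))


def matches(ps, ts):
    """Lift match to sequences, combining the resulting substitutions."""
    if len(ps) != len(ts):
        return None
    subst = {}
    for p, t in zip(ps, ts):
        s = match(p, t)
        if s is None:
            return None
        subst.update(s)
    return subst
```

Note that `"precon"` succeeds whatever the tag of the term is; this is safe only for well typed instances, which is exactly what the respectfulness condition demands.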

2.2 Standard Implementation 

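Theorem. For $D : \forall \vec{i} : \vec{I}.\ \star$, with typical $c : \forall \vec{a} : \vec{A}.\ D\ \vec{r}_1 \to \cdots \to D\ \vec{r}_j \to D\ \vec{s}$, 
this typical ι-scheme implements the ι-rules (the standard implementation): 

$$\text{D-E}\ [\vec{s}]\ (c\ \vec{a}\ \vec{y})\ P\ \vec{m} \mapsto m_c\ \vec{a}\ \vec{y}\ (\text{D-E}\ \vec{r}_1\ y_1\ P\ \vec{m}) \ldots (\text{D-E}\ \vec{r}_j\ y_j\ P\ \vec{m})$$

Proof. For any Γ, if $\Gamma \vdash$ D-E $\vec{s}'\ (c\ \vec{a}'\ \vec{y}')\ P'\ \vec{m}' : T$ then 

MATCHES($[\vec{s}]\ (c\ \vec{a}\ \vec{y})\ P\ \vec{m}$, $\vec{s}'\ (c\ \vec{a}'\ \vec{y}')\ P'\ \vec{m}'$) ⟹ σ 

but matching the other ι-schemes fails, so these schemes are Γ-well-defined. 
Moreover, σ is $(\vec{a}'/\vec{a};\ \vec{y}'/\vec{y};\ P'/P;\ \vec{m}'/\vec{m})$. Typechecking, $c\ \vec{a}'\ \vec{y}' : 
D\ (\vec{a}'/\vec{a};\ \vec{y}'/\vec{y})\vec{s} = D\ \sigma\vec{s}$. Hence $\sigma\vec{s} = \vec{s}'$ as D-E $\vec{s}'\ (c\ \vec{a}'\ \vec{y}')$ is well-typed. Hence 
our typical scheme is Γ-respectful. □ 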

The standard implementation comments out the indices — just as well, be- 
cause there is no guarantee that they generally take the constructor form which 
explicit matching requires. For example, Vect-E has standard implementation 

    Vect-E [A] [0]   (ε A)        P m_ε m_::  ↦  m_ε 
    Vect-E [A] [s k] (:: A k a v) P m_ε m_::  ↦  m_:: k a v (Vect-E A k v P m_ε m_::) 



2.3 Alternative Implementations 

Where the indices of a constructor’s return type do happen to resemble constructor 
or variable patterns, we are free to consider alternative implementations of 
the corresponding ι-schemes. We may certainly comment out a pattern variable 
from the target if we can recover it by matching an index. For example, this is 
also an implementation of Vect-E: 

    Vect-E A 0     (ε [A])          P m_ε m_::  ↦  m_ε 
    Vect-E A (s k) (:: [A] [k] a v) P m_ε m_::  ↦  m_:: k a v (Vect-E A k v P m_ε m_::) 

But we can do better than that. There is no need to check the constructor 
tags on both the length and the target — one check will do. We may take either 
    (†)  Vect-E A [0]     (ε [A])          P m_ε m_::  ↦  m_ε 
         Vect-E A ([s] k) (:: [A] [k] a v) P m_ε m_::  ↦  m_:: k a v (Vect-E A k v P m_ε m_::) 

or, instead, privileging index length over vector contents 

    (‡)  Vect-E A 0     ([ε] [A])          P m_ε m_::  ↦  m_ε 
         Vect-E A (s k) ([::] [A] [k] a v) P m_ε m_::  ↦  m_:: k a v (Vect-E A k v P m_ε m_::) 

In the sequel, we show how to choose good alternative implementations for 
elimination operators by systematically exploiting the presence of constructor 
symbols in indices. This leads naturally to space optimisations, where we do not 
merely comment out unnecessary data from patterns — we delete them entirely 
from the representation of datatypes. 

2.4 ExTT — an Execution Language for TT with Deleted Terms 

We introduce ExTT, an execution language for terms in TT. ExTT extends TT’s 
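syntax with deleted terms and patterns $\{t\}$, and also with deleted constructor 
patterns $\{c\}\ \vec{p}$ corresponding to untagged tuples $\{c\}\ \vec{t}$. We extend the operational 
semantics thus: 

$$\begin{aligned}
\mathrm{MATCH}(\{t\},\ \{t'\}) &\Longrightarrow \mathrm{ID}\\
\mathrm{MATCH}(\{c\}\ \vec{p},\ t) &\Longrightarrow \mathrm{MATCHES}(\vec{p},\ \vec{t}) \quad \text{if } \mathrm{WHNF}(t) \Longrightarrow (\{c'\}\ \vec{t})
\end{aligned}$$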

We are careful to distinguish ({c} {t}), which is represented by the empty tu- 
ple, from {c t}, which is deleted altogether. The actual evaluation of terms in 
ExTT can be by any standard method, such as normalisation by evaluation [1, 
5], compilation to G machine code [12] or program extraction [13]. 

The unmarking operation |·| takes both patterns and terms in ExTT back to 
terms in TT by stripping out both [•] and {•} marks. Terms in ExTT arise only 
by optimisations from well typed TT terms hence ExTT needs no typing rules 
provided that these optimisations are safe. 

We specify an optimisation by giving a substitution ⟦·⟧ from TT identifiers 
to ExTT terms, id by default, together with the optimised ExTT ι-schemes. For 
ι-rules $\Gamma \vdash$ D-E $\vec{t}_i = e_i$, these have form D-E $\vec{p}_i \mapsto d_i$, where $|\vec{p}_i| = \vec{t}_i$, $|d_i| = e_i$ 
and every undeleted free variable in $d_i$ is a pattern variable in $\vec{p}_i$. For all Γ, 
these schemes must be Γ-well-defined in the obvious way, and Γ-respectful in 
that 

if $\Gamma \vdash$ D-E $\vec{t} : T$ and MATCHES($\vec{p}_i$, ⟦$\vec{t}$⟧) ⟹ σ 
then there exists a substitution τ such that $\Gamma \vdash \tau|\sigma(\text{D-E}\ \vec{p}_i)|$ = D-E $\vec{t} : T$ 

The role of τ is to instantiate the variables free in $e_i$, but deleted in $d_i$; these 
are not needed when executing ExTT terms, hence they need not be matched. 
In the following sections, we establish several such optimisations. 

3 Eliding Redundant Constructor Arguments 

Recall the alternative implementation of Vect-E († above) which matches A and 
k in the indices rather than the target. When can we do this, in general? 
Whenever $c\ \vec{a},\ c\ \vec{b} : D\ \vec{s}$ implies $a_i = b_i$, we say that the $i$th argument of $c$ 
is forceable. E.g., the A argument to ε is forceable since if ε a, ε b : Vect A 0 
then clearly a = b = A. For ::, A and k are forceable in the same way. 

Constructor arguments which have been commented out owing to their repetition 
in an ι-scheme are forceable. This is to be expected; such repeated arguments 
arise from the patterns describing constructor indices. 

Consider a typical constructor, fully applied to variables, $c\ \vec{a}\ \vec{y} : D\ \vec{s}$. If 
we express $\vec{s}$ as $|\vec{p}|$ for any patterns $\vec{p}$, then any $a_i$ appearing as a pattern 
variable in $\vec{p}$ is forceable, by injectivity of constructors. We call these arguments 
concretely forceable since they can be retrieved in constant time by pattern 
matching on the indices. 

To express $\vec{s}$ as $|\vec{p}|$, we write a program to extract from a term a linear 
pattern with its variable set: 

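$$\begin{aligned}
\mathrm{PAT}(V,\ x) &\Longrightarrow (x \cup V,\ x) \quad \text{if } x \notin V\\
\mathrm{PAT}(V,\ c\ \vec{t}) &\Longrightarrow (V',\ \mathrm{LAZY}(c,\ \vec{p})) \quad \text{if } \mathrm{PATS}(V,\ \vec{t}) \Longrightarrow (V',\ \vec{p})\\
\mathrm{PAT}(V,\ t) &\Longrightarrow (V,\ [t])\\
\mathrm{PATS}(V,\ \cdot) &\Longrightarrow (V,\ \cdot)\\
\mathrm{PATS}(V,\ t\ \vec{t}) &\Longrightarrow (V'',\ p\ \vec{p}) \quad \text{if } \mathrm{PAT}(V,\ t) \Longrightarrow (V',\ p) \text{ and } \mathrm{PATS}(V',\ \vec{t}) \Longrightarrow (V'',\ \vec{p})\\
\mathrm{LAZY}(c,\ [\vec{p}]) &\Longrightarrow [c\ \vec{p}]\\
\mathrm{LAZY}(c,\ \vec{p}) &\Longrightarrow [c]\ \vec{p} \quad \text{otherwise}
\end{aligned}$$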

For our typical constructor $c$, we can extract the patterns which D-E will match 
by PATS($\emptyset$, $\vec{s}$) ⟹ ($V$, $\vec{p}$). If an argument $a_i \in V$ then $a_i$ is concretely forceable. 
It is instantiated by matching $\vec{p}$, hence we may presuppose it when we match 
the target, yielding the same result. Hence, we may then choose the alternative 
implementation: 

$$\text{D-E}\ \vec{p}\ (c\ \vec{a}^{[V]}\ \vec{y})\ P\ \vec{m} \mapsto m_c \cdots \qquad \text{where } a^{[V]} \Longrightarrow [a] \text{ if } a \in V, \quad a^{[V]} \Longrightarrow a \text{ otherwise}$$
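
The pattern-extraction program can be sketched in Python as follows. This is a hedged illustration: the tuple encodings (`("var", x)`, `("con", c, args)` for terms; `("pre", t)` and `("precon", c, ps)` for presupposed patterns) and the function names are assumptions of the sketch. A constructor whose sub-patterns are all presupposed collapses into a single presupposed term, which is the role of LAZY.

```python
def pat(seen, t):
    """Extract a linear pattern from term t, given variables already seen."""
    if t[0] == "var" and t[1] not in seen:
        return seen | {t[1]}, t              # first occurrence: bind it
    if t[0] == "con":
        seen2, ps = pats(seen, t[2])
        return seen2, lazy(t[1], t[2], ps)
    return seen, ("pre", t)                  # repeated variable or other term


def pats(seen, ts):
    """Thread the variable set through a sequence of terms, left to right."""
    out = []
    for t in ts:
        seen, p = pat(seen, t)
        out.append(p)
    return seen, out


def lazy(c, args, ps):
    """Presuppose the whole term if nothing inside it binds a variable."""
    if all(p[0] == "pre" for p in ps):
        return ("pre", ("con", c, args))
    return ("precon", c, ps)
```

Applied to the indices `A, s k` of the `::` constructor, the sketch returns the variable set {A, k} and the patterns `A, [s] k`, so both A and k are concretely forceable.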

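Theorem. The following is an optimisation (forcing): 

for $c : \forall \vec{a} : \vec{A}.\ D\ \vec{r}_1 \to \cdots \to D\ \vec{r}_j \to D\ \vec{s}$ where PATS($\emptyset$, $\vec{s}$) ⟹ ($V$, $\vec{p}$) 

take ⟦$c$⟧ ⟹ $\lambda \vec{a};\ \vec{y}.\ c\ \vec{a}^{\{V\}}\ \vec{y}$ 

$$\text{D-E}\ \vec{p}\ (c\ \vec{a}^{\{V\}}\ \vec{y})\ P\ \vec{m} \mapsto m_c\ \vec{a}\ \vec{y}\ (\text{D-E}\ \vec{r}_1\ y_1\ P\ \vec{m}) \ldots (\text{D-E}\ \vec{r}_j\ y_j\ P\ \vec{m})$$

where $a^{\{V\}} \Longrightarrow \{a\}$ if $a \in V$, and $a^{\{V\}} \Longrightarrow a$ otherwise 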

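Proof. Clearly, $|\vec{p}| = \vec{s}$ and $|c\ \vec{a}^{\{V\}}\ \vec{y}| = c\ \vec{a}\ \vec{y}$, so if $\Gamma \vdash$ D-E $\vec{s}'\ (c\ \vec{a}'\ \vec{y}')\ P'\ \vec{m}' : T$ 
then, as before, $\vec{s}' = (\vec{a}'/\vec{a};\ \vec{y}'/\vec{y})\vec{s}$. Now, 

MATCHES($\vec{p}$, $(\vec{a}'/\vec{a};\ \vec{y}'/\vec{y})\vec{s}$) ⟹ ($a'_i/a_i$ if $a_i \in V$) 
MATCHES($c\ \vec{a}^{\{V\}}\ \vec{y}$, $c\ \vec{a}'\ \vec{y}'$) ⟹ ($a'_i/a_i$ if $a_i \notin V$; $\vec{y}'/\vec{y}$) 

Hence any matching substitution σ for the left-hand side satisfies 

id $|\sigma(\text{D-E}\ \vec{p}\ (c\ \vec{a}^{\{V\}}\ \vec{y})\ P\ \vec{m})|$ = D-E $\vec{s}'\ (c\ \vec{a}'\ \vec{y}')\ P'\ \vec{m}'$ 

So these schemes are Γ-respectful. They are clearly Γ-well-defined, as they 
discriminate on the target’s constructor. □ 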

For our Vect example, forcing is given by: 

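$$
\begin{array}{l}
[\![\varepsilon]\!] \Longrightarrow \lambda A.\ \varepsilon\ \{A\} \\
[\![::]\!] \Longrightarrow \lambda A; k; a; v.\ ::\ \{A\}\ \{k\}\ a\ v \\[4pt]
\mathsf{Vect\text{-}E}\ A\ [0]\ (\varepsilon\ \{A\})\ P\ m_\varepsilon\ m_{::} \mapsto m_\varepsilon \\
\mathsf{Vect\text{-}E}\ A\ ([\mathsf{s}]\ k)\ (::\ \{A\}\ \{k\}\ a\ v)\ P\ m_\varepsilon\ m_{::} \mapsto m_{::}\ k\ a\ v\ (\mathsf{Vect\text{-}E}\ A\ k\ v\ P\ m_\varepsilon\ m_{::})
\end{array}
$$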

In the implementation the deleted arguments really are removed from the now 
fully applied constructors. This is safe because these terms are only decomposed 
by Vect-E which does not expect the deleted arguments. 

4 Eliding Redundant Constructor Tags 

Recall the second alternative implementation of $\mathsf{Vect\text{-}E}$ (†) where case selection
is by analysis of the length index rather than the target itself.

For which types can we do case selection on an argument other than the
target?

If $c\ \vec{a},\ c'\ \vec{b} : D\ \vec{s}$ implies $c = c'$, we say that the family $D$ is detaggable.
Vect is detaggable because the length index determines whether the constructor
is $\varepsilon$ (if the length index is $0$) or $::$ (if the length index is $\mathsf{s}\ k$).

For any set of $\iota$-schemes, if the index patterns are already mutually exclusive,
we can decide which scheme applies without checking the target's constructor
tag. The following program checks if two patterns are guaranteed to match
disjoint sets of terms:

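$$
\begin{array}{lcll}
\mathrm{DISJOINT}(c\ \vec{p},\ c'\ \vec{q}) &\Longrightarrow& \mathrm{true} & \text{if } c \neq c' \\
\mathrm{DISJOINT}(c\ \vec{p},\ c\ \vec{q}) &\Longrightarrow& \exists i.\ \mathrm{DISJOINT}(p_i, q_i) & \\
\mathrm{DISJOINT}([c]\ \vec{p},\ [c]\ \vec{q}) &\Longrightarrow& \exists i.\ \mathrm{DISJOINT}(p_i, q_i) & \\
\mathrm{DISJOINT}(p,\ q) &\Longrightarrow& \mathrm{false} & \text{otherwise}
\end{array}
$$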

Of course if we are to match on the indices then we must actually examine
their constructors, so the previous lazy definition of PATS is not sufficient. We
compute the patterns we need for this optimisation with EPATS, which is the same as
PATS but with LAZY replaced by EAGER:

$$
\mathrm{EAGER}(c,\ \vec{p}) \Longrightarrow c\ \vec{p}
$$

Given a family $D$ with constructors $c_i : \forall \vec{x} : \vec{X}.\ D\ \vec{s}_i$ where $\mathrm{EPATS}(\emptyset, \vec{s}_i) \Longrightarrow (V_i, \vec{p}_i)$,
we say $D$ is concretely detaggable if

$$
\forall i \neq j.\ \exists k.\ \mathrm{DISJOINT}(p_{ik},\ p_{jk}) \Longrightarrow \mathrm{true}
$$
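The disjointness test and the concrete detaggability condition can be sketched in Python. The encoding is an assumption made for illustration only: a pattern variable is a string, `("con", c, ps)` is an eagerly checked constructor pattern, and `("lazy", c, ps)` is a pattern whose tag is presupposed. The function names are hypothetical.

```python
def disjoint(p, q):
    """DISJOINT(p, q): True only if p and q can never match the same term."""
    both = isinstance(p, tuple) and isinstance(q, tuple)
    if both and p[0] == "con" and q[0] == "con":
        if p[1] != q[1]:
            return True                       # different constructors
        return any(disjoint(a, b) for a, b in zip(p[2], q[2]))
    if both and p[0] == "lazy" and q[0] == "lazy" and p[1] == q[1]:
        return any(disjoint(a, b) for a, b in zip(p[2], q[2]))
    return False                              # conservative default

def concretely_detaggable(index_patterns):
    """index_patterns[i] is the list of eager index patterns of constructor i.
    Every pair of distinct constructors must differ at some index position."""
    n = len(index_patterns)
    return all(
        any(disjoint(a, b) for a, b in zip(index_patterns[i], index_patterns[j]))
        for i in range(n) for j in range(n) if i != j
    )
```

With the eager index patterns `A, 0` and `A, s k` for the two Vect constructors the check succeeds at the length index. For a family whose constructors all target `s n`, no position separates them and the check fails.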
Theorem. We may optimise (detag) such a concretely detaggable $D$ thus:

$$
\begin{array}{l}
[\![c_i]\!] \Longrightarrow \lambda \vec{x}.\ \{c_i\}\ \vec{x}^{\{V_i\}} \\
\mathsf{D\text{-}E}\ \vec{p}_i\ (\{c_i\}\ \vec{x}^{\{V_i\}})\ P\ \vec{m} \mapsto e_i
\end{array}
$$

Proof. These schemes are $\Gamma$-respectful for all $\Gamma$ by the same argument as for
forcing: the switch to eager patterns does not affect the set of variables matched
from the indices, nor the success of matching well-typed values. Deleting the
constructor in the target can only improve the possibility of a match, but the
disjointness condition directly ensures that the schemes remain $\Gamma$-well-defined.
□



For our Vect example, detagging is given by: 

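$$
\begin{array}{l}
[\![\varepsilon]\!] \Longrightarrow \lambda A.\ \{\varepsilon\}\ \{A\} \\
[\![::]\!] \Longrightarrow \lambda A; k; a; v.\ \{::\}\ \{A\}\ \{k\}\ a\ v \\[4pt]
\mathsf{Vect\text{-}E}\ A\ 0\ (\{\varepsilon\}\ \{A\})\ P\ m_\varepsilon\ m_{::} \mapsto m_\varepsilon \\
\mathsf{Vect\text{-}E}\ A\ (\mathsf{s}\ k)\ (\{::\}\ \{A\}\ \{k\}\ a\ v)\ P\ m_\varepsilon\ m_{::} \mapsto m_{::}\ k\ a\ v\ (\mathsf{Vect\text{-}E}\ A\ k\ v\ P\ m_\varepsilon\ m_{::})
\end{array}
$$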

We achieve this space optimisation at the cost of using eager rather than 
lazy patterns. The number of constructor tests required increases by a constant 
(possibly zero!) factor and indices may sometimes be computed where they would 
previously be ignored. Clearly a real implementation would minimise the number 
of eager patterns required to make the distinction. An analysis of this space/time 
trade-off is beyond the scope of this paper, but for Vect it seems likely to be 
worthwhile since we have swapped one constructor test for another. 



5 Run-Time Optimisation 

In our $\mathsf{Vect\text{-}E}$ example, we have already deleted both $\varepsilon$ and its argument. We
might be tempted to go a step further, and comment out that entire target.

$$
\mathsf{Vect\text{-}E}\ A\ 0\ [\{\varepsilon\}\ \{A\}]\ P\ m_\varepsilon\ m_{::} \mapsto m_\varepsilon
$$

However, this $\iota$-scheme is not respectful and breaks subject reduction thus:

$$
\begin{array}{ll}
\ldots;\ x : \mathsf{Vect}\ A\ 0 \vdash & \mathsf{Vect\text{-}E}\ A\ 0\ x\ P\ m_\varepsilon\ m_{::} : P\ 0\ x \\
 & \mapsto m_\varepsilon : P\ 0\ \varepsilon
\end{array}
$$

The pattern $(\{\varepsilon\}\ \{A\})$ may not test tags or extract arguments, but it still only
matches targets whose weak head-normal forms are constructor applications.
The optimisations we have seen thus far are safe to use in any context, and we
need to reduce under binders when performing the equality checks which ensure
that Epigram programs elaborate to well typed terms.

However, at run-time, we can employ a much more restricted notion of
computation, reducing only in the empty context, $\varepsilon$. In this scenario, we can exploit
the adequacy property of TT (if $\varepsilon \vdash t : D\ \vec{s}$ then $\mathrm{WHNF}(t)$ is $c\ \vec{t}$ for some $\vec{t}$)
to gain further optimisations, not available in a general context.

In effect, we may employ weaker criteria for alternative implementations of
elimination operators in run-time execution. We say that a run-time
optimisation is given by a substitution and $\iota$-schemes in ExTT as before, except that
these schemes need only be $\varepsilon$-respectful and $\varepsilon$-well-defined.

The adequacy property tells us that the target will always match a constructor
pattern at run-time, hence we may safely presuppose a pattern from which
no information is gained, as suggested above. Moreover, by applying this
observation inductively, we can sometimes extract another, more drastic optimisation
from the guarantee of adequacy at run-time.



6 Collapsing Content-Free Families at Run-Time 



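Consider the less than or equal relation, declared and elaborated as follows:

$$
\underline{\mathrm{data}}\quad \frac{x,\ y : \mathbb{N}}{x \le y : \star}
\quad \underline{\mathrm{where}}\quad
\frac{}{\mathsf{leO} : 0 \le y}
\qquad
\frac{p : x \le y}{\mathsf{leS}\ p : \mathsf{s}\,x \le \mathsf{s}\,y}
$$

$$
\begin{array}{lcl}
\le &:& \mathbb{N} \to \mathbb{N} \to \star \\
\mathsf{leO} &:& \forall y : \mathbb{N}.\ \le\ 0\ y \\
\mathsf{leS} &:& \forall x, y : \mathbb{N}.\ \le\ x\ y \to\ \le\ (\mathsf{s}\,x)\ (\mathsf{s}\,y)
\end{array}
$$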

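The $\le$ family describes a property of its indices and stores no other data.
It is not surprising therefore to find that much of its content can be deleted.
Forcing and detagging yield:

$$
\begin{array}{l}
[\![\mathsf{leO}]\!] \Longrightarrow \lambda y.\ (\{\mathsf{leO}\}\ \{y\}) \\
[\![\mathsf{leS}]\!] \Longrightarrow \lambda x; y; p.\ (\{\mathsf{leS}\}\ \{x\}\ \{y\}\ p) \\[4pt]
\le\text{-}\mathsf{E}\ 0\ y\ (\{\mathsf{leO}\}\ \{y\})\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}} \mapsto m_{\mathsf{leO}}\ y \\
\le\text{-}\mathsf{E}\ (\mathsf{s}\,x)\ (\mathsf{s}\,y)\ (\{\mathsf{leS}\}\ \{x\}\ \{y\}\ p)\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}} \\
\qquad \mapsto m_{\mathsf{leS}}\ x\ y\ p\ (\le\text{-}\mathsf{E}\ x\ y\ p\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}})
\end{array}
$$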

Now we are left with only one undeleted argument, the recursive $p$ in $\mathsf{leS}$.
This argument serves two purposes: firstly it is the target of the recursive call
and secondly it is passed to the method $m_{\mathsf{leS}}$. We might think that $p$ can also be
elided, since ultimately it can only be examined by $\le\text{-}\mathsf{E}$ which, by induction, can
be shown never to examine it. In a partial evaluation setting, however, where we
may reduce under binders, we must at least check that the target is canonical for
reduction to be possible. If not, we run the risk of reducing a proof of something
which cannot be constructed, such as $5 \le 4$!

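At run-time, on the other hand, we never need to check that $p$ is canonical
because the adequacy property tells us that it must be. Hence, at run-time, we
no longer need to store the recursive argument, and the entire family collapses:

$$
\begin{array}{l}
[\![\mathsf{leO}]\!] \Longrightarrow \lambda y.\ (\{\mathsf{leO}\ y\}) \\
[\![\mathsf{leS}]\!] \Longrightarrow \lambda x; y; p.\ (\{\mathsf{leS}\ x\ y\ p\}) \\
[\![\le\text{-}\mathsf{E}]\!] \Longrightarrow \lambda x; y; p; P; m_{\mathsf{leO}}; m_{\mathsf{leS}}.\ \le\text{-}\mathsf{E}\ x\ y\ \{p\}\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}}
\end{array}
$$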

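For which families can we do this run-time optimisation? If $a, b : D\ \vec{s}$ implies
$a = b$ we say that the family $D$ is collapsible. $\le$ is collapsible because any value
in $x \le y$ is determined entirely by the indices $x$ and $y$.

$$
\begin{array}{l}
\le\text{-}\mathsf{E}\ 0\ y\ \{\mathsf{leO}\ y\}\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}} \mapsto m_{\mathsf{leO}}\ y \\
\le\text{-}\mathsf{E}\ (\mathsf{s}\,x)\ (\mathsf{s}\,y)\ \{\mathsf{leS}\ x\ y\ p\}\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}} \\
\qquad \mapsto m_{\mathsf{leS}}\ x\ y\ (\{p\})\ (\le\text{-}\mathsf{E}\ x\ y\ \{p\}\ P\ m_{\mathsf{leO}}\ m_{\mathsf{leS}})
\end{array}
$$

We say a family is concretely collapsible if it is detaggable and for each
constructor $c : \forall \vec{a} : \vec{A}.\ D\ \vec{r}_1 \to \cdots \to D\ \vec{r}_j \to D\ \vec{s}$, $\mathrm{EPATS}(\emptyset, \vec{s}) \Longrightarrow (\vec{a}, \vec{p})$.
That is, the constructor tag and all the non-recursive arguments are cheaply
recoverable from the indices.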
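Read operationally, the collapsed schemes for the less than or equal family recurse on the indices alone. A minimal Python sketch, assuming natural numbers are plain integers and the deleted proof is passed to the method as `None` (the function name and the encoding are hypothetical, not taken from the prototype):

```python
def le_elim(x, y, m_leO, m_leS):
    """Collapsed eliminator for x <= y: the proof target is never stored or inspected.
    Assumes x <= y holds, which adequacy guarantees at run-time."""
    if x == 0:
        return m_leO(y)
    # The recursive proof p is deleted, so None stands in for it.
    return m_leS(x - 1, y - 1, None, le_elim(x - 1, y - 1, m_leO, m_leS))
```

For instance, counting the `leS` steps in the proof of 3 ≤ 5 needs only the two indices: `le_elim(3, 5, lambda y: 0, lambda x, y, p, rec: rec + 1)`.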

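Theorem. We may optimise a concretely collapsible family at run-time:

$$
\begin{array}{l}
\mathsf{D\text{-}E}\ \vec{p}\ \{c\ \vec{a}\ \vec{y}\}\ P\ \vec{m} \\
\qquad \mapsto m_c\ \vec{a}\ (\{y_1\}) \ldots (\{y_n\})\ (\mathsf{D\text{-}E}\ \vec{r}_1\ \{y_1\}\ P\ \vec{m}) \ldots (\mathsf{D\text{-}E}\ \vec{r}_n\ \{y_n\}\ P\ \vec{m}) \\[4pt]
[\![c]\!] \Longrightarrow \lambda \vec{a}; \vec{y}.\ (\{c\ \vec{a}\ \vec{y}\}) \\
[\![\mathsf{D\text{-}E}]\!] \Longrightarrow \lambda \vec{i}; x; P; \vec{m}.\ \mathsf{D\text{-}E}\ \vec{i}\ \{x\}\ P\ \vec{m}
\end{array}
$$

Proof. These schemes are $\varepsilon$-well-defined by the same argument as for detagging.
They are $\varepsilon$-respectful because the only possible left-hand sides have the form
$\varepsilon \vdash \mathsf{D\text{-}E}\ \vec{s}'\ (c\ \vec{a}'\ \vec{y}')\ P'\ \vec{m}'$, hence, by disjointness, the only possible match, even
with the target deleted, is with the scheme for $c$, with matching substitution
$\sigma = (\vec{a}'/\vec{a};\ P'/P;\ \vec{m}'/\vec{m})$, binding all the undeleted free variables on the
right-hand side because $\mathrm{EPATS}(\emptyset, \vec{s}) \Longrightarrow (\vec{a}, \vec{p})$. Taking $\tau = (\vec{y}'/\vec{y})$, we see that

$$
\varepsilon \vdash \tau|\sigma(\mathsf{D\text{-}E}\ \vec{p}\ \{c\ \vec{a}\ \vec{y}\}\ P\ \vec{m})| = \mathsf{D\text{-}E}\ \vec{s}'\ (c\ \vec{a}'\ \vec{y}')\ P'\ \vec{m}'
$$

hence these schemes are $\varepsilon$-respectful. □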



7 Examples 



7.1 The Finite Sets 



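The finite sets, indexed over a natural number $n$, are a family of types with
$n$ elements. Effectively, they are a representation of bounded numbers and are
declared as follows:

$$
\underline{\mathrm{data}}\quad \frac{n : \mathbb{N}}{\mathsf{Fin}\ n : \star}
\quad \underline{\mathrm{where}}\quad
\frac{}{\mathsf{fO} : \mathsf{Fin}\ (\mathsf{s}\,n)}
\qquad
\frac{i : \mathsf{Fin}\ n}{\mathsf{fs}\ i : \mathsf{Fin}\ (\mathsf{s}\,n)}
$$

The forcing optimisation elides the indices from the elaborated constructors:

$$
\begin{array}{l}
[\![\mathsf{fO}]\!] \Longrightarrow \lambda n.\ \mathsf{fO}\ \{n\} \\
[\![\mathsf{fs}]\!] \Longrightarrow \lambda n; i.\ \mathsf{fs}\ \{n\}\ i
\end{array}
$$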

After stripping the forceable arguments, the shape of the resulting type 
matches that of N — that is, the base constructor takes no arguments and 
the step constructor takes a single recursive argument. In principle, any optimi- 
sations which apply to N such as Magaud and Bertot’s binary representation [16] 
should also apply to Fin. We hope to recover Xi’s efficient treatment of bounded 
numbers in this way [27] and perhaps extend it to other forms of validation. 
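The observation that a forced Fin value has the shape of a natural number can be illustrated with a small Python sketch. The tuple encoding `("fO",)` and `("fs", i)` and both function names are assumptions made for this example; the sketch shows only the shape correspondence, not the binary representation of [16].

```python
def nat_to_fin(k):
    """Build the forced Fin value fs (fs ... fO) with k occurrences of fs."""
    i = ("fO",)
    for _ in range(k):
        i = ("fs", i)
    return i

def fin_to_nat(i):
    """Count the fs tags; no index n is stored, so none is consulted."""
    k = 0
    while i[0] == "fs":
        k += 1
        i = i[1]
    return k
```

Since the two directions are mutually inverse, any representation chosen for natural numbers could in principle carry Fin values as well.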
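7.2 Comparison of Natural Numbers

The Compare family from [19] represents the result of comparing two numbers,
storing which is the greater and by how much:

$$
\underline{\mathrm{data}}\quad \frac{m,\ n : \mathbb{N}}{\mathsf{Compare}\ m\ n : \star}
\quad \underline{\mathrm{where}}\quad
\frac{}{\mathsf{lt}\ y : \mathsf{Compare}\ x\ (x + (\mathsf{s}\,y))}
$$

$$
\frac{}{\mathsf{eq} : \mathsf{Compare}\ x\ x}
\qquad
\frac{}{\mathsf{gt}\ x : \mathsf{Compare}\ (y + (\mathsf{s}\,x))\ y}
$$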



Compare is an example of a family which is collapsible, but not concretely 
collapsible. Clearly there is only one possible element of Compare m n for each 
m and n, and given this element we can extract their difference in constant 
time. If we were to collapse Compare we would replace this simple inspection 
by the recomputation of the difference each time the same value was used. We 
restrict concretely collapsible families to those where the recomputation of values 
is cheap. Nonetheless, by forcing, Compare need only store which index is larger 
and by how much: 

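$$
\begin{array}{l}
[\![\mathsf{lt}]\!] \Longrightarrow \lambda x; y.\ \mathsf{lt}\ \{x\}\ y \\
[\![\mathsf{eq}]\!] \Longrightarrow \lambda x.\ \mathsf{eq}\ \{x\} \\
[\![\mathsf{gt}]\!] \Longrightarrow \lambda x; y.\ \mathsf{gt}\ x\ \{y\}
\end{array}
$$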
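What the forced Compare value stores can be sketched in Python: a tag plus the difference minus one, with the deleted index recoverable from the other one. This is an illustrative sketch using plain integers; `compare` and `second_index` are hypothetical names.

```python
def compare(m, n):
    """Return the forced Compare value: only the tag and the stored difference."""
    if m < n:
        return ("lt", n - m - 1)   # lt y, where n = m + (y + 1)
    if m == n:
        return ("eq",)
    return ("gt", m - n - 1)       # gt x, where m = n + (x + 1)

def second_index(m, c):
    """Recover the second index n from the first index m and a Compare value."""
    if c[0] == "lt":
        return m + c[1] + 1
    if c[0] == "eq":
        return m
    return m - c[1] - 1
```

Reading the stored difference is a constant-time inspection, whereas a collapsed representation would have to recompute `n - m - 1` from the indices on every use.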



7.3 Accessibility Predicates 



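In [6], Bove and Capretta use special-purpose accessibility predicates to prove
termination of general recursive functions. For example, quicksort terminates
on nil, and it terminates on cons x xs if it terminates on filter (< x) xs and
filter (> x) xs. This is expressed by the qsAcc predicate below:

$$
\underline{\mathrm{data}}\quad \frac{l : \mathsf{List}\ \mathbb{N}}{\mathsf{qsAcc}\ l : \star}
\quad \underline{\mathrm{where}}\quad
\frac{}{\mathsf{qsNil} : \mathsf{qsAcc}\ \mathsf{nil}}
$$

$$
\frac{qsl : \mathsf{qsAcc}\ (\mathsf{filter}\ ({<}\ x)\ xs) \qquad qsr : \mathsf{qsAcc}\ (\mathsf{filter}\ ({>}\ x)\ xs)}{\mathsf{qsCons}\ qsl\ qsr : \mathsf{qsAcc}\ (\mathsf{cons}\ x\ xs)}
$$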



quicksort itself is defined by induction over qsAcc, so a naive implementation 
would need to store the proofs. However, qsAcc is concretely collapsible: 

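$$
\begin{array}{l}
[\![\mathsf{qsNil}]\!] \Longrightarrow \{\mathsf{qsNil}\} \\
[\![\mathsf{qsCons}]\!] \Longrightarrow \lambda x; xs; qsl; qsr.\ \{\mathsf{qsCons}\ x\ xs\ qsl\ qsr\}
\end{array}
$$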

Collapsing replaces computation over qsAcc by computation over its indices, 
restoring the intended operational semantics of the original program! These ac- 
cessibility predicates are concretely collapsible because their indices are con- 
structed from the constructor patterns of programs. 
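The effect of collapsing qsAcc can be illustrated by comparing two Python versions of quicksort: one that recurses over an explicit proof tree, as a naive implementation would, and one that recurses over the list alone. This is a sketch under stated assumptions: the names are hypothetical, and the second filter keeps elements greater than or equal to the pivot so that duplicates survive, which is an assumption of this example rather than a reading of the declaration above.

```python
def qs_acc(xs):
    """Build the accessibility proof tree that a naive implementation would store."""
    if not xs:
        return ("qsNil",)
    x, rest = xs[0], xs[1:]
    return ("qsCons",
            qs_acc([y for y in rest if y < x]),
            qs_acc([y for y in rest if y >= x]))

def quicksort_by_proof(xs, acc):
    """Structural recursion on the proof acc; xs is only an index."""
    if acc[0] == "qsNil":
        return []
    x, rest = xs[0], xs[1:]
    return (quicksort_by_proof([y for y in rest if y < x], acc[1])
            + [x]
            + quicksort_by_proof([y for y in rest if y >= x], acc[2]))

def quicksort_collapsed(xs):
    """After collapsing: the proof is gone and case analysis is on the index xs."""
    if not xs:
        return []
    x, rest = xs[0], xs[1:]
    return (quicksort_collapsed([y for y in rest if y < x])
            + [x]
            + quicksort_collapsed([y for y in rest if y >= x]))
```

Both versions compute the same result, but the collapsed one never allocates or traverses the proof tree.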
7.4 The Simply Typed λ-Calculus

We define the simply typed λ-calculus in a similar fashion to [19], making
extensive use of inductive families to specify invariants on the data structures. We
begin with STy, representing simple monomorphic types:

$$
\underline{\mathrm{data}}\quad \frac{}{\mathsf{STy} : \star}
\quad \underline{\mathrm{where}}\quad
\frac{}{\iota : \mathsf{STy}}
\qquad
\frac{s,\ t : \mathsf{STy}}{s \Rightarrow t : \mathsf{STy}}
$$

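We represent contexts by Vects of types, Ctx = Vect STy. The explicit length
allows a safe de Bruijn representation of variables, via the Fin family, hence our
untyped terms, Expr, are at least well scoped. The length is forceable for each
constructor:

$$
\underline{\mathrm{data}}\quad \frac{n : \mathbb{N}}{\mathsf{Expr}\ n : \star}
\quad \underline{\mathrm{where}}\quad
\frac{i : \mathsf{Fin}\ n}{\mathsf{eVar}\ i : \mathsf{Expr}\ n}
$$

$$
\frac{S : \mathsf{STy} \qquad t : \mathsf{Expr}\ (\mathsf{s}\,n)}{\mathsf{eLam}\ S\ t : \mathsf{Expr}\ n}
\qquad
\frac{f,\ s : \mathsf{Expr}\ n}{\mathsf{eApp}\ f\ s : \mathsf{Expr}\ n}
$$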



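The Var relation gives types to variables. Var G i T states that the ith member
of the context G has type T. Clearly Var is concretely collapsible.

$$
\underline{\mathrm{data}}\quad \frac{G : \mathsf{Ctx}\ n \qquad i : \mathsf{Fin}\ n \qquad T : \mathsf{STy}}{\mathsf{Var}\ G\ i\ T : \star}
\quad \underline{\mathrm{where}}
$$

$$
\frac{}{\mathsf{stop} : \mathsf{Var}\ (S{::}G)\ \mathsf{fO}\ S}
\qquad
\frac{v : \mathsf{Var}\ G\ i\ T}{\mathsf{pop}\ v : \mathsf{Var}\ (S{::}G)\ (\mathsf{fs}\ i)\ T}
$$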



Finally, we have the well typed terms, indexed over contexts, the original 
raw terms and types. This gives us a particularly safe representation — no 
typechecker can return the wrong well typed term. This indexing also enables 
us to synchronise terms safely with value environments during evaluation in the 
style of Augustsson and Carlsson [3] . 



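$$
\underline{\mathrm{data}}\quad \frac{G : \mathsf{Ctx}\ n \qquad e : \mathsf{Expr}\ n \qquad T : \mathsf{STy}}{\mathsf{Term}\ G\ e\ T : \star}
\quad \underline{\mathrm{where}}\quad
\frac{v : \mathsf{Var}\ G\ i\ T}{\mathsf{var}\ v : \mathsf{Term}\ G\ (\mathsf{eVar}\ i)\ T}
$$

$$
\frac{b : \mathsf{Term}\ (S{::}G)\ e\ T}{\mathsf{lam}\ b : \mathsf{Term}\ G\ (\mathsf{eLam}\ S\ e)\ (S \Rightarrow T)}
$$

$$
\frac{f : \mathsf{Term}\ G\ fe\ (S \Rightarrow T) \qquad a : \mathsf{Term}\ G\ ae\ S}{\mathsf{app}\ f\ a : \mathsf{Term}\ G\ (\mathsf{eApp}\ fe\ ae)\ T}
$$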



Term seems to involve a horrifying amount of duplication. Fortunately, many 
of the arguments are forceable and thanks to the indexing over raw terms, Term 
is detaggable. After optimisation, this is all that remains: 

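$$
\begin{array}{l}
[\![\mathsf{var}]\!] \Longrightarrow \lambda n; G; i; T; v.\ \{\mathsf{var}\}\ \{n\}\ \{G\}\ \{i\}\ \{T\}\ \{v\} \\
[\![\mathsf{lam}]\!] \Longrightarrow \lambda n; G; S; e; T; b.\ \{\mathsf{lam}\}\ \{n\}\ \{G\}\ \{S\}\ \{e\}\ \{T\}\ b \\
[\![\mathsf{app}]\!] \Longrightarrow \lambda n; G; fe; S; T; f; ae; a.\ \{\mathsf{app}\}\ \{n\}\ \{G\}\ \{fe\}\ S\ \{T\}\ f\ \{ae\}\ a
\end{array}
$$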

The only non-recursive arguments which survive are the domain types of ap- 
plications. Typechecking thus consists of ensuring that these can be determined. 
8 Conclusions and Further Work

The ideas presented here have been tested in a prototype implementation. Exe- 
cution in this system is by extraction to a Haskell coding of ExTT values without 
the deleted subterms. We have used the GHC profiling tools [24] to assess the 
space usage of programs. 

Our experiments show a significant reduction in space requirements over a 
naive implementation particularly where there is extensive indexing. For vector 
operations, a 10-20% saving in memory usage is typical (depending on the length 
of the vector), but for the typechecker, a saving of over 80% has been observed. 

Although remarkably straightforward, these optimisations only present themselves
when one takes dependently typed programming seriously. The forcing
optimisation largely overcomes the space penalty of adopting dependent types,
but detagging derives new dynamic benefit from previously unavailable static
information. Collapsing, too, has significant consequences, deleting accessibility
arguments and all the equational reasoning from run-time code, not because we
deem them to be proof-irrelevant, but because they actually are.

We suspect that these optimisations are the first of many. For example, as
we erase forceable indices, it is worth identifying operations which affect nothing
else, such as weaken : Fin n → Fin (s n), which embeds a value in a higher
indexed set; this is effectively the identity function. This optimisation applies
wherever functions exist only to manage invariants.

We might also consider the low level implementation of high level types, such
as the natural numbers. By replacing N-E with an appropriate elimination rule
for unbounded binary numbers [16] we can achieve a significant speed-up. Any
other data structure with the same shape after optimisation, e.g. Fin, can be
treated similarly. In a practical implementation, such optimisations are essential
for comparable performance to its conventional counterparts.

Optimisation of a new language with a new type system naturally presents 
new problems and new opportunities. While we can never hope to produce a 
completely optimal program in all cases, this research leads us to believe that 
the presence of much more static information can only give us greater scope for 
optimisation in both time and space. 
Abstract. The paper presents the system of named modules implemented
in Coq version 7.4 and shows that this extension is conservative.
It is also shown that the implemented module system is ready for the
future planned extension of Coq with definitions of functions by means
of rewrite rules. More precisely, the impact of the module system on the
acceptance criterion for rewrite rules is carefully studied, leading to the
formulation of four closure properties that have to be satisfied by the
acceptance criterion in order to validate the conservativity proof. It turns
out that syntactic termination criteria such as Higher Order Recursive
Path Ordering or the General Schema can be adapted to satisfy these
closure properties.



1 Introduction 

Computer aided theorem proving has become an important part of modern theo- 
retical computer science. Various proof assistants gain more and more popularity 
and industrial size problems begin to be addressed. To become applicable in the 
industry, such systems must provide ways to structure large developments and 
a high degree of automation. 

In this paper, we concentrate on the Coq proof assistant [5], a system that
is based on the so-called Curry-Howard isomorphism, relating logical formulas
to types and their proofs to terms inhabiting these types. The logical formalism
implemented in Coq is an extension of the calculus of constructions of Coquand
and Huet [7] with inductive types [8] and a predicative hierarchy of universes [14].
Coq has been actively developed for more than 10 years and is used both for
formalizing mathematics (for example the fundamental theorem of algebra) and
for program verification (data structures, telecommunication protocols, Javacard
platform etc.).

In order to address the large development issues, proof developments in Coq
may be divided into files and an ML-style module system has recently been
implemented [3]. The latter allows clear modelling of parametrized theories and
certified data structures and their convenient instantiation, therefore
encouraging proof reuse.

The automation issues are being addressed by various tactics, but the main
show-stopper is the treatment of equations. Even though equational reasoning
is so common in mathematics it is still quite difficult to use it in Coq, especially
compared to its competitors such as Isabelle [16] or PVS [15]. For that reason
the inclusion of an efficient term rewriting engine is also planned for Coq.

In order to make rewriting as efficient as possible it is desirable to include it 
in the internal conversion relation of Coq. In this way, arbitrarily long reduction 
sequences would result in very small proof-terms. The other possibility of using 
rewriting based on the Leibniz equality is much more space- and time-consuming, 
as each rewriting step generates a fragment of the proof-term proportional to 
the size of the original goal. 

Since the logical correctness of Coq depends on good metatheoretical proper- 
ties of conversion, like confluence and strong normalization of reduction rules, we 
have to preserve these properties when extending conversion with user-defined 
rewrite rules. Since these properties of term rewriting systems are undecidable 
in general, the research in this area concentrates on incomplete acceptance crite- 
ria that guarantee termination and confluence and are flexible enough to accept 
most rewriting systems already known to be terminating and confluent [1,18]. 

The goal of this paper is to prove that the modular extension of Coq is 
conservative and that it will remain so after rewriting is added to Coq. We 
study the impact of the module system on acceptance criteria for rewrite rules 
and formulate closure properties that have to be satisfied by the criteria in order 
to make the conservativity proof work. The closure properties turn out to be 
compatible with syntactic acceptance criteria available in the literature. 

In the type theory setting ML-style modules were first introduced in 1992 
in the LF logical framework [12]. Later Courant [9,10] proposed a module cal- 
culus suitable for pure type systems, an example of which is the calculus of 
constructions and many other known type theories. In the calculus of Courant 
modules are second-class anonymous objects, reductions on modules have sub- 
ject reduction and termination properties and therefore the modular extension of 
a given pure type system is conservative. 1 Thanks to the consequent use of mod- 
ule interfaces, Courant’s module calculus allows smooth composition of libraries, 
separate checking of dependent parts of large proof developments and guarantees 
correctness of proofs upon a conservative upgrade of their components. 

Unfortunately, given currently available acceptance criteria, definitions by 
rewriting cannot be integrated in the calculus of anonymous modules. Indeed, 
reductions on modules would impose very strong closure properties on accep- 
tance criteria, such as closure by arbitrary substitutions, which does not hold 
for any of the available acceptance criteria. 

The module system implemented in Coq is therefore a syntactic restriction of
the calculus of Courant, allowing the use of existing acceptance criteria for rewrite
rules, without sacrificing the good properties of the module system mentioned above.
The proof of conservativity is changed accordingly: instead of relying on
arbitrary reductions on anonymous modules we chose a particular reduction strategy
preserving syntactic constraints, trying to make the closure properties imposed on
acceptance criteria as weak as possible.

1 Courant also writes that his proof would still be valid if anonymous inductive types 
were added to the pure type system. 
We start our presentation by formalizing pure type systems with generative
definitions, a compromise between anonymous inductive types and global
rewriting systems. On one hand, in theoretical papers inductive types are usually
treated as anonymous entities [8,20], even though in Coq they are implemented
as named. On the other hand, rewriting is usually given in the form of one global
rewriting system which is treated as a parameter of typing rules. This approach
is of course incompatible with interactive proof assistants, where users are used
to constructing their proofs and definitions gradually. However, known syntactic
acceptance criteria are modular and allow the global set of rewrite
rules to be extended by entering one accepted rewriting system after another.

In our formalism, definitions by rewriting and inductive definitions are en- 
tered in the global environment, and they can be accessed in terms through the 
names assigned to them by the environment. 

Next, this formalism is extended with a system of named modules and the
generative definitions now become part of modules and module interfaces. We
show that a potential proof of False in the calculus with modules can be
transformed into a proof of False in the calculus without modules. The closure
properties on acceptance criteria for definitions by rewriting result from the analysis
of this transformation. We conclude the paper with an argument that these
properties are satisfiable.

2 Pure Type Systems with Generative Definitions 

We present here a formalization of a pure type system with generative inductive
definitions and generative definitions by rewriting, which is quite close to the
way these elements are and will be implemented in Coq. The formalism is built
upon a set of PTS sorts $\mathcal{S}$, a binary relation $\mathcal{A}$ and a ternary relation $\mathcal{R}$ over $\mathcal{S}$
governing the typing rules (Term/Ax) and (Term/Prod) respectively (Fig. 4).
The syntactic class of pseudoterms is defined as follows:

$$
e, t ::= v \mid s \mid (e_1\ e_2) \mid \lambda v{:}t.e \mid \Pi v{:}t_1.t_2
$$

The difference between $e$ and $t$ is only intuitive, to help the reader distinguish
between the role of a term $e$ and a type $t$, but both these letters denote elements
of the syntactic class of pseudoterms. A pseudoterm can be a variable, a sort
from $\mathcal{S}$, an application, an abstraction or a product.

Inductive definitions and definitions by rewriting are stored in the environment
and used in terms only through names assigned to them by the environment.
Therefore an environment is a sequence of declarations, each of them
being a constant definition $v : t := e$, a variable declaration $v : t$, an inductive
definition $\mathrm{IndDef}(E^I := E^C)$, where $E^I$ and $E^C$ are environments of (possibly
mutually defined) inductive types and their constructors, or a definition by
rewriting $\mathrm{RewDef}(E^R, R)$, where $E^R$ is an environment of (possibly mutually
defined) function symbols and $R$ is a set of rewrite rules defining them.
Environments $E^I$, $E^C$ in inductive definitions and $E^R$ in definitions by rewriting
contain only variable declarations. We assume that names of all declarations in
environments are pairwise disjoint.

Definition 1. A pure type system with generative definitions is defined by the
typing rules in Figs. 1, 2, 3 and 4. The relation $\approx$ used in the rule (Term/Conv)
is the congruence generated by the sum of beta, delta and rewrite reductions.

As in [2,19] elimination of inductive types is supposed to be expressed by
rewriting.

The rules for correctness of definitions contain side-conditions. The side
condition $\mathrm{POS}_E(E^I := E^C)$ stands for a positivity condition on inductive definitions
as given for example in [8,20]. The condition $\mathrm{ACC}_E(E^R, R)$ stands for an
acceptance condition as given for example in [2,19]. Both conditions are meant
to assure the decidability of type-checking in any correct environment.

Consider the following sequence of declarations in the Coq-like syntax: 2 

    Inductive nat : Set := 0 : nat | S : nat -> nat.

    Symbol plus : nat -> nat -> nat
    Rules
      plus 0 y           --> y
      plus (S x) y       --> S (plus x y)
      plus x (plus y z)  --> plus (plus x y) z
      plus x 0           --> x
      plus x (S y)       --> S (plus x y).

    Definition two : nat := plus (S 0) (S 0).

    Parameter n : nat.

It can be interpreted as an environment $E$ consisting of the inductive definition
of natural numbers, the symmetric definition by rewriting of addition, the definition
of the constant two and the declaration of a variable n of type nat. Assuming
that the definition of natural numbers satisfies the condition POS() and the
definition of addition satisfies the condition ACC(), we can derive the judgment
$E \vdash \mathrm{ok}$ using the rules in Fig. 1.
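To see how the five rules for plus compute, here is a small Python sketch of innermost normalization over Peano terms. It is an illustration only and makes no claim about how the experimental rewriting branch of Coq works; the term encoding (tuples for `O`, `S` and `plus`, strings for variables such as the parameter `n`) and the function names are assumptions of this example.

```python
O = ("O",)

def S(t):
    return ("S", t)

def plus(a, b):
    return ("plus", a, b)

def head(t):
    """Head symbol of a term, or None for a variable (encoded as a string)."""
    return t[0] if isinstance(t, tuple) else None

def step(t):
    """Try the five rules for plus at the root of t; return None if none applies."""
    if head(t) != "plus":
        return None
    a, b = t[1], t[2]
    if a == O:
        return b                              # plus 0 y --> y
    if head(a) == "S":
        return S(plus(a[1], b))               # plus (S x) y --> S (plus x y)
    if head(b) == "plus":
        return plus(plus(a, b[1]), b[2])      # plus x (plus y z) --> plus (plus x y) z
    if b == O:
        return a                              # plus x 0 --> x
    if head(b) == "S":
        return S(plus(a, b[1]))               # plus x (S y) --> S (plus x y)
    return None

def normalize(t):
    """Normalize subterms first, then rewrite at the root until no rule applies."""
    if not isinstance(t, tuple):
        return t
    t = (t[0],) + tuple(normalize(s) for s in t[1:])
    r = step(t)
    return normalize(r) if r is not None else t
```

With this encoding the constant two normalizes to `S (S O)`, and the symmetric rules also simplify open terms: `plus n (S O)` reduces to `S n` even though `n` is a variable.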

The definition of logical consistency for a calculus with generative definitions
has to be more involved than the usual requirement that False is not inhabited
in the empty environment. Indeed, the latter formulation does not account
for generative definitions at all. Therefore we define logical consistency by the
requirement that False is not inhabited in any closed environment, i.e. an
environment without variable declarations and where all functions are total.

The requirement that functions defined by rewriting are total could very well
be included in the condition ACC(). However, we decided to assume the
existence of a separate condition on definitions by rewriting, called $\mathrm{COMP}_E(E^R, R)$
for completeness, that is satisfied only if all functions from $E^R$ are total, i.e.
completely defined by $R$ in the environment $E$.

The separation between ACC() and COMP() is motivated by the idea of
working with abstract function symbols, equipped with some rewrite rules not
defining them completely. For example if plus were declared using only the third
rule from the system given above, we could develop a theory of an associative

2 The syntax of the definition by rewriting is inspired by the experimental “Rewriting” 
branch of Coq developed by Blanqui. For the sake of clarity we omit certain details, 
like the environment of rule variables. 
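$$
\frac{}{\epsilon \vdash \mathrm{ok}}
\qquad
\frac{E \vdash \mathrm{ok} \quad E \vdash t : s}{E;\ v : t \vdash \mathrm{ok}}
\qquad
\frac{E \vdash \mathrm{ok} \quad E \vdash t : s \quad E \vdash e : t}{E;\ v : t := e \vdash \mathrm{ok}}
$$

$$
\frac{E \vdash \mathrm{ok} \quad E \vdash \mathrm{IndDef}(E^I := E^C) : \mathrm{correct}}{E;\ \mathrm{IndDef}(E^I := E^C) \vdash \mathrm{ok}}
\qquad
\frac{E \vdash \mathrm{ok} \quad E \vdash \mathrm{RewDef}(E^R, R) : \mathrm{correct}}{E;\ \mathrm{RewDef}(E^R, R) \vdash \mathrm{ok}}
$$

Fig. 1. Environment correctness

Fig. 2. Correctness of definitions

Fig. 3. Environment lookup

$$
\text{(Term/Prod)}\quad
\frac{E \vdash t_1 : s_1 \quad E;\ v : t_1 \vdash t_2 : s_2}{E \vdash \Pi v{:}t_1.t_2 : s_3}
\ \text{where } (s_1, s_2, s_3) \in \mathcal{R}
$$

$$
\text{(Term/Abs)}\quad
\frac{E;\ v : t_1 \vdash e : t_2 \quad E \vdash \Pi v{:}t_1.t_2 : s}{E \vdash \lambda v{:}t_1.e : \Pi v{:}t_1.t_2}
\qquad
\text{(Term/Ax)}\quad
\frac{E \vdash \mathrm{ok}}{E \vdash s_1 : s_2}
\ \text{where } (s_1, s_2) \in \mathcal{A}
$$

$$
\text{(Term/App)}\quad
\frac{E \vdash e : \Pi v{:}t_1.t_2 \quad E \vdash e' : t_1}{E \vdash e\ e' : t_2\{v \mapsto e'\}}
\qquad
\text{(Term/Conv)}\quad
\frac{E \vdash e : t \quad E \vdash t' : s \quad E \vdash t \approx t'}{E \vdash e : t'}
$$

Fig. 4. PTS rules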
function over natural numbers. The usefulness of abstract functions equipped
with rewrite rules will become clearer after the introduction of modules, because
they will allow us to instantiate abstract elements with concrete ones.

Definition 2. A correct environment $E$ is closed if it contains no variable
declarations and if all definitions by rewriting are complete, i.e. every time $E$ can
be split into $E_1;\ \mathrm{RewDef}(E^R, R);\ E_2$ the condition $\mathrm{COMP}_{E_1}(E^R, R)$ is satisfied.
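Definition 2 can be read as a simple check over the sequence of declarations. The Python sketch below is an illustration under stated assumptions: the declaration classes are hypothetical stand-ins for the four kinds of declaration, and the completeness condition COMP is passed in as a caller-supplied predicate, since the text assumes it as a parameter rather than defining it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VarDecl:
    name: str

@dataclass(frozen=True)
class ConstDef:
    name: str

@dataclass(frozen=True)
class IndDef:
    name: str

@dataclass(frozen=True)
class RewDef:
    symbols: tuple
    rules: tuple

def is_closed(env, comp):
    """env is a list of declarations, assumed already correct.
    comp(prefix, rewdef) plays the role of COMP for the prefix E1."""
    for k, decl in enumerate(env):
        if isinstance(decl, VarDecl):
            return False                       # variable declarations are not allowed
        if isinstance(decl, RewDef) and not comp(env[:k], decl):
            return False                       # an incomplete definition by rewriting
    return True
```

Note that each definition by rewriting is checked against the prefix of the environment that precedes it, mirroring the split into a prefix, the definition and a suffix.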



Definition 3. A pure type system with generative definitions is logically con- 
sistent if False is not inhabited in any closed environment. 

Summarizing, conditions ACC() and POS() are supposed to guarantee that typing
in every correct environment is decidable and COMP() is supposed to guarantee
that False is not provable in any closed environment.

The problem of consistency for the calculus of constructions with rewriting 
has not been very well studied yet. The only condition that we know that guar- 
antees consistency is given in [2]. Please see the discussion of this question at 
the end of Sect. 5. 

3 Calculus of Named Modules 

In this section we define a calculus built on top of a pure type system, containing 
inductive definitions, definitions by rewriting and a system of named modules. 
The latter is an adaptation of an ML-style module system with the following 
usual features: 

Structures bundle together related definitions and lemmas. They correspond
to records.

Signatures - module types of structures - play the role of interfaces. They
correspond to record types with manifest fields [13,11].

Functors are parametric modules. Their application to modules makes
instantiation very convenient. They correspond to dependent functions.

Functor types - module types of functors - correspond to dependent products.
Higher order functors are functors taking parameters which are themselves
functors.

Subtyping allows functors to be applied to modules with more components. It is
monotone for signatures and contravariant in the argument type and covariant
in the result type for functor types.

Nested modules - modules can be components of structures and signatures.

The reader is invited to consult Sect. 1.5.1 of [4] for simple examples of all 
these features. Below we present the details of the calculus of named modules. 
We deliberately ignore the problem of name clashes in nested structures and 
signatures, referring the reader to [4] again for details. 
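$$
\begin{array}{lrcl}
\text{Access paths} & p &::=& v \mid p.v \\
\text{Terms} & e, t &::=& p \mid s \mid (e_1\ e_2) \mid \lambda v{:}t.e \mid \Pi v{:}t_1.t_2 \\
\text{Blocks} & B &::=& p \mid \mathrm{Rew}(E^R, R) \mid \mathrm{Ind}(E^I := E^C) \\
\text{Block types} & BT &::=& \mathrm{RewSpec}(E^R, R) \mid \mathrm{IndSpec}(E^I := E^C) \\
\text{Rewriting systems} & R &::=& E_1 : l_1 \to r_1 : t_1\ \ldots\ E_n : l_n \to r_n : t_n \\
\text{Modules} & m &::=& p \mid \mathrm{Struct}\ v_1 : S_1 := P_1 \ldots v_n : S_n := P_n\ \mathrm{End} \mid \mathrm{Functor}[v{:}M]\ m \mid (p_1\ p_2) \\
\text{Module types} & M &::=& \mathrm{Sig}\ v_1 : S_1 \ldots v_n : S_n\ \mathrm{End} \mid \mathrm{Funsig}(v{:}M_1)\ M_2 \\
\text{Implementations} & P &::=& e \mid B \mid m \\
\text{Specifications} & S &::=& \mathrm{Ty}(t) \mid \mathrm{Eq}(e : t) \mid \mathrm{Ty}(BT) \mid \mathrm{Eq}(p : BT) \mid \mathrm{Ty}(M) \mid \mathrm{Eq}(p : M) \\
\text{Specification sorts} & SS &::=& \mathrm{modtype} \mid \mathrm{blocktype} \mid \mathrm{spec} \\
\text{Environments} & E &::=& \epsilon \mid v_1 : S_1 \ldots v_n : S_n
\end{array}
$$

Fig. 5. Syntax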



Specifications. In order to avoid giving many sets of similar rules, for example
for extracting information from the environment and for extracting information
from module signatures, we decided (after Courant) to introduce the auxiliary
notion of specifications. They allow us to factor the extraction rules into one
rule for extracting a specification from the environment, one rule for
extracting a specification from a signature and one set of rules for extracting
information from a specification. Specifications are also used to factor the
rules for environment creation, the typing rules for structures and the
subtyping rules for signatures.

There are two kinds of specifications: abstract Ty(Φ) and manifest Eq(φ : Φ).
Using specifications, a variable declaration is written v : Ty(Φ), which means
that v has type Φ, and a constant declaration is written v : Eq(φ : Φ), meaning
that v is of type Φ and equal to φ.

This way an environment can be uniformly presented as an assignment of 
specifications to names. Inductive definitions and definitions by rewriting are 
also tailored to match this framework, but this will be explained later. 
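
As a minimal sketch of this uniform presentation, the following Python fragment models an environment as a list of names with abstract or manifest specifications. The class and function names, the string encoding of terms, and the lemma name plus_comm are assumptions made for illustration only; they are not part of the calculus.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Ty:
    """Abstract specification Ty(t): only the type is visible."""
    type: str


@dataclass(frozen=True)
class Eq:
    """Manifest specification Eq(e : t): the type and the definition are visible."""
    body: str
    type: str


Spec = Union[Ty, Eq]
Env = List[Tuple[str, Spec]]


def lookup(env: Env, name: str) -> Spec:
    """Return the specification assigned to a name, in the spirit of (Env/Lookup)."""
    for v, spec in reversed(env):
        if v == name:
            return spec
    raise KeyError(name)


def type_of(spec: Spec) -> str:
    """Both kinds of specification expose a type, as in (Ty/Type) and (Eq/Type)."""
    return spec.type


def definition_of(spec: Spec) -> Optional[str]:
    """Only a manifest specification exposes a definition, as in (Eq/Comp)."""
    return spec.body if isinstance(spec, Eq) else None


# Hypothetical environment: one lemma (abstract) and one constant (manifest).
env: Env = [
    ("plus_comm", Ty("forall x y, plus x y = plus y x")),
    ("two", Eq("plus (S O) (S O)", "nat")),
]
```

Looking up two yields both a type and a definition, whereas looking up plus_comm yields only a type, which mirrors the distinction between constants and variables described above.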



Structures and Signatures. From the point of view of typing, signatures corre- 
spond to fragments of correct environments (see the rule (Sig/Form) in Fig. 7) 
and structures correspond to fragments of closed environments. 

Like environments, signatures assign specifications to names. Structures as- 
sign specifications and implementations to names. The typing rule for structures 
(Sig/Struct) explains the role of these elements: every implementation satisfies 
its corresponding specification, but implementations are not used to type-check 
subsequent components. 

This gives us a nice way to formally distinguish between lemmas with proofs
and definitions of constants. A lemma is represented as a triple v : Ty(t) := e
where v is its name, t its statement and e its proof, which is not used to type
subsequent components. A constant definition is a triple v : Eq(e : t) := e, where
the equality v ≈ e can be deduced from its specification and hence used to type
subsequent components. Even though the necessity of duplicating e may look
strange, the advantages of using specifications are worth it.

Blocks and Block Types. Apart from lemmas and constant definitions, structures
can also contain inductive definitions and definitions by rewriting (and also
modules themselves). Suppose m is a structure containing the first three
definitions given in Sect. 2, i.e. the inductive definition of natural numbers,
the function plus defined by rewriting and the definition of two. If we bind m
to a name, say A, we expect to be able to use the elements of m, prefixed by A.
In particular A.O and A.S should both be constructors of the inductive type
A.nat and the rewrite rule A.plus A.O x → x should be available in the
conversion.

Since we want our module system to allow separate checking of modules, the 
extraction of information from modules should be based on module types and 
not on module expressions. Therefore it is necessary to put information about 
inductive definitions and definitions by rewriting inside signatures. In order to 
enforce the parallel between structures and closed environments, rewriting in 
structures should be subject to both ACC() and COMP() conditions, and rewrit- 
ing in signatures - just to ACC(). 

This is exactly the case in our system. We decided to split RewDef(E^r, R)
into two constructions RewSpec(E^r, R) and Rew(E^r, R), the first being an
element of specifications and subject to the condition ACC(E^r, R), and the
latter an implementation and subject to the condition COMP(E^r, R) (see rules
(RewSpec/Form) and (RewSpec/Rew) in Fig. 10).

The reason why the syntactic class of specifications does not directly contain
RewSpec(E^r, R) comes from considering the operation of module renaming:

    Module B := A.

After that operation, the module type of B should reflect both the fact that the
plus component is a symbol defined by rewriting and the fact that it is equal to
A.plus. Since we want to have the principal types property for module types, we
decided to wrap RewSpec(E^r, R) into specifications Ty() and Eq().

Inductive definitions are treated similarly: we also split IndDef(E^I := E^C)
into IndSpec(E^I := E^C) and Ind(E^I := E^C), but the rule (IndSpec/Ind) for
typing Ind(E^I := E^C) has no additional side-condition compared to the rule
(IndSpec/Form) from Fig. 9.

Note that inductive definitions and definitions by rewriting both have a po- 
tential to mutually define many symbols. Therefore we decided to create a new 
syntactic class of blocks, denoting a possibly mutually defined sequence of terms. 
There is also a whole typing hierarchy for blocks. Next to terms, types, sorts and 
term specifications, we introduce blocks, block types, the sort blocktype and 
block specifications. A similar hierarchy is also available for modules: there are 
module expressions, module types, the sort modtype and module specifications. 

Note that the rules (IndSpec/Form) and (RewSpec/Form) correspond to rules
from Fig. 2, and a sequence (Ty/Form) or (Eq/Form) followed by (Env/Insert)
corresponds to environment insertion rules from Fig. 1. Similarly, a sequence
(Env/Lookup), followed by (Ty/Type) or (Eq/Type) or (Eq/Comp), followed
optionally by (IndSpec/IndType), (IndSpec/Constr), (RewSpec/Fun) or the rule
(RewSpec/Comp) corresponds to environment lookup rules from Fig. 3.

(Env/Empty)
    --------
    ε ⊢ ok

(Env/Insert)
    E ⊢ ok    E ⊢ S : spec
    ----------------------
    E; v : S ⊢ ok

(Env/Lookup)
    E_1; v : S; E_2 ⊢ ok
    -----------------------
    E_1; v : S; E_2 ⊢ v : S

Fig. 6. Environment

Calculus of Named Modules. Our calculus contains a module system which is 
a syntactic restriction of the module calculus of Courant. Indeed, the syntactic 
class of terms does not contain the general selection operator m.v , where m is 
a module expression, but only its restriction to so-called access paths [13]. The 
same restriction also appears in module and block manifest specifications and in 
module application. 

Definition 4. The Calculus of Named Modules is defined by the typing rules
in Fig. 4,³ 6, 7, 8, 9, 10 and 11.⁴ The relation ≈ appearing in the premises of
the rules (Term/Conv) and (Block/Conv) is the congruence defined by beta
reduction and rules (Eq/Comp) and (RewSpec/Comp).



4 Examples 

Now we can show the representation, expressed in our formal abstract syntax, of
the binding Module A := m, where m is the structure mentioned in the previous
section:

    A : Ty(...) := Struct
      nat, O, S : Ty(IndSpec(E^I_nat := E^C_nat)) := Ind(E^I_nat := E^C_nat)
      plus : Ty(RewSpec(E^r_plus, R_plus)) := Rew(E^r_plus, R_plus)
      two : Eq(plus (S O) (S O) : nat) := plus (S O) (S O)
    End

where E^I_nat = nat : Ty(Set), E^C_nat = O : Ty(nat); S : Ty(nat → nat),
E^r_plus = plus : Ty(nat → nat → nat) and R_plus is the system with 5 rules
from Sect. 2. Note that nat (and other names as well) appears three times here:
first as the name of a structure component, second as a local binder inside
IndSpec(E^I_nat := E^C_nat) and third as a local binder inside
Ind(E^I_nat := E^C_nat). Again, the duplication here is only needed to make the
presentation with specifications possible.

3 To be 100% formal, in the premises of rules (Term/Prod) and (Term/Abs) the
environment should be written E; v : Ty(t_1).

4 The framed rules form a subsystem called principal, which is used in Sect. 5. 
(Sig/Form)
    E; v_1 : S_1 ... v_n : S_n ⊢ ok
    -----------------------------------------
    E ⊢ Sig v_1 : S_1 ... v_n : S_n End : modtype

(Sig/Access)
    E ⊢ p : Sig v_1 : S_1 ... v_n : S_n End
    -------------------------------------------
    E ⊢ p.v_k : S_k{v_i ↦ p.v_i}_{i=1...k-1}

(Sig/Struct)
    E ⊢ Sig v_1 : S_1 ... v_n : S_n End : modtype
    E; v_1 : S_1 ... v_{k-1} : S_{k-1} ⊢ P_k : S_k   for k = 1...n
    -------------------------------------------------------------------
    E ⊢ Struct v_1 : S_1 := P_1 ... v_n : S_n := P_n End : Sig v_1 : S_1 ... v_n : S_n End

(Sig/Sub)
    E ⊢ Sig v_1 : S_1 ... v_n : S_n End : modtype
    E ⊢ Sig v'_1 : S'_1 ... v'_n' : S'_n' End : modtype
    E; v_1 : S_1 ... v_n : S_n ⊢ v'_k : S'_k   for k = 1...n'
    -------------------------------------------------------------------
    E ⊢ Sig v_1 : S_1 ... v_n : S_n End <: Sig v'_1 : S'_1 ... v'_n' : S'_n' End

(Funsig/Form)
    E ⊢ M_1 : modtype    E; v : Ty(M_1) ⊢ M_2 : modtype
    ---------------------------------------------------
    E ⊢ Funsig(v:M_1) M_2 : modtype

(Funsig/Functor)
    E; v : Ty(M_1) ⊢ m : M_2    E ⊢ Funsig(v:M_1) M_2 : modtype
    -----------------------------------------------------------
    E ⊢ Functor[v:M_1] m : Funsig(v:M_1) M_2

(Funsig/Sub)
    E ⊢ M'_1 <: M_1    E; v : Ty(M'_1) ⊢ M_2 <: M'_2
    ------------------------------------------------
    E ⊢ Funsig(v:M_1) M_2 <: Funsig(v:M'_1) M'_2

(Mod/App)
    E ⊢ p : Funsig(v:M_1) M_2    E ⊢ p' : M_1
    -----------------------------------------
    E ⊢ p p' : M_2{v ↦ p'}

(Subsumption)
    E ⊢ m : M_1    E ⊢ M_1 <: M_2
    -----------------------------
    E ⊢ m : M_2

(Strengthening)
    E ⊢ p : M    E ⊢ M : modtype
    ----------------------------
    E ⊢ p : M/p

where the strengthening operation is defined by

    (Sig v_1 : S_1 ... v_n : S_n End)/p = Sig v_1 : S_1/p.v_1 ... v_n : S_n/p.v_n End
    (Funsig(v:M_1) M_2)/p = Funsig(v:M_1) M_2
    Ty(t)/p       = Eq(p : t)
    Eq(e' : t)/p  = Eq(e' : t)
    Ty(BT)/p      = Eq(p : BT)
    Eq(p' : BT)/p = Eq(p' : BT)
    Ty(M)/p       = Eq(p : M/p)
    Eq(p' : M)/p  = Eq(p' : M/p)

Fig. 7. Modules



Symbols φ, Φ, Ξ are either e, t, s, or p, M, modtype, or p, BT, blocktype.

(Ty/Form)
    E ⊢ Φ : Ξ
    ----------------
    E ⊢ Ty(Φ) : spec

(Ty/Sat)
    E ⊢ P : Φ    E ⊢ Ty(Φ) : spec
    -----------------------------
    E ⊢ P : Ty(Φ)

(Ty/Type)
    E ⊢ P : Ty(Φ)
    -------------
    E ⊢ P : Φ

(Eq/Form)
    E ⊢ φ : Φ    E ⊢ Φ : Ξ
    ----------------------
    E ⊢ Eq(φ : Φ) : spec

(Eq/Sat)
    E ⊢ P ≈ φ    E ⊢ P : Φ    E ⊢ Eq(φ : Φ) : spec
    ----------------------------------------------
    E ⊢ P : Eq(φ : Φ)

(Eq/Type)
    E ⊢ P : Eq(φ : Φ)
    -----------------
    E ⊢ P : Φ

(Eq/Comp)
    E ⊢ P : Eq(φ : Φ)
    -----------------
    E ⊢ P ≈ φ

Fig. 8. Specifications









Let E^I = v^I_1 : Ty(t^I_1) ... v^I_k : Ty(t^I_k)
and E^C = v^C_1 : Ty(t^C_1) ... v^C_n : Ty(t^C_n)

(IndSpec/Form)
    E ⊢ t^I_j : s_j   for j = 1...k
    E; E^I ⊢ t^C_i : s'_i   for i = 1...n
    -------------------------------------
    E ⊢ IndSpec(E^I := E^C) : blocktype
    if POS_E(E^I := E^C)

(IndSpec/Ind)
    E ⊢ IndSpec(E^I := E^C) : blocktype
    ------------------------------------------
    E ⊢ Ind(E^I := E^C) : IndSpec(E^I := E^C)

(IndSpec/IndType)
    E ⊢ p : IndSpec(E^I := E^C)
    ---------------------------
    E ⊢ p_j : t^I_j
    if j ∈ {1...k}

(IndSpec/Constr)
    E ⊢ p : IndSpec(E^I := E^C)
    --------------------------------------------
    E ⊢ p_{k+i} : t^C_i{v^I_j ↦ p_j}_{j=1...k}
    if i ∈ {1...n}

Fig. 9. Inductive blocks



Let E^r = v_1 : Ty(t_1) ... v_n : Ty(t_n) and
R = {E_i : e^l_i → e^r_i : t_i}_{i=1...m} where E_i = v^i_1 : Ty(t^i_1); ...; v^i_{n_i} : Ty(t^i_{n_i})

(RewSpec/Form)
    E ⊢ t_k : s_k   for k = 1...n
    E; E_i ⊢ ok             E; E^r; E_i ⊢ t_i : s_i
    E; E^r; E_i ⊢ e^l_i : t_i    E; E^r; E_i ⊢ e^r_i : t_i     for i = 1...m
    -------------------------------------------------------
    E ⊢ RewSpec(E^r, R) : blocktype
    if ACC_E(E^r, R)

(RewSpec/Rew)
    E ⊢ RewSpec(E^r, R) : blocktype
    ----------------------------------
    E ⊢ Rew(E^r, R) : RewSpec(E^r, R)
    if COMP_E(E^r, R)

(RewSpec/Sat)
    E ⊢ RewSpec(E^r, R) : blocktype
    E ⊢ p_j : t_j   for j = 1...n
    E; E_i ⊢ e^l_i θ ≈ e^r_i θ   for i = 1...m
    ------------------------------------------
    E ⊢ p : RewSpec(E^r, R)
    where θ = {v_j ↦ p_j}_{j=1...n}

(RewSpec/Fun)
    E ⊢ p : RewSpec(E^r, R)
    -----------------------
    E ⊢ p_j : t_j

(RewSpec/Comp)
    E ⊢ p : RewSpec(E^r, R)    δ : E_i → E
    --------------------------------------
    E ⊢ e^l_i θ δ ≈ e^r_i θ δ
    where θ = {v_j ↦ p_j}_{j=1...n}

The notation δ : E_i → E means that E ⊢ vδ : tδ holds for every v : Ty(t) ∈ E_i.

Fig. 10. Rewriting blocks



(Block/Conv)
    E ⊢ B : BT    E ⊢ BT' : blocktype    E ⊢ BT ≈ BT'
    -------------------------------------------------
    E ⊢ B : BT'

Fig. 11. Block conversion
The principal signature of the structure m (omitted in Ty(...) above) can
be obtained by removing the implementations (elements after :=) and replacing
Struct with Sig. The principal signature of B defined by Module B := A is more
interesting. It is obtained by using the (Strengthening) rule (Fig. 7):

    Sig
      nat, O, S : Eq(A.nat, A.O, A.S : IndSpec(E^I_nat := E^C_nat))
      plus : Eq(A.plus : RewSpec(E^r_plus, R_plus))
      two : Eq(plus (S O) (S O) : nat)
    End

Note that using (Sig/Access) followed by (Eq/Type) we can derive the judgment
E ⊢ B.nat, B.O, B.S : IndSpec(E^I_nat := E^C_nat), meaning that B.nat, B.O,
B.S form an inductive family, and using (Eq/Comp) instead, we derive the
equalities B.nat ≈ A.nat, B.O ≈ A.O and B.S ≈ A.S.
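
The strengthening operation M/p of Fig. 7 can be sketched in Python as follows. This is only an illustration under simplifying assumptions: terms, block types and paths are plain strings, each signature field has a single name (multi-name blocks such as nat, O, S are not modelled), and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ty:
    """Abstract specification Ty(x); x is a type, a block type or a module type."""
    of: object


@dataclass(frozen=True)
class Eq:
    """Manifest specification Eq(d : x)."""
    body: object
    of: object


@dataclass(frozen=True)
class Sig:
    """Sig v1 : S1 ... vn : Sn End, as a tuple of (name, specification) pairs."""
    fields: Tuple[Tuple[str, object], ...]


@dataclass(frozen=True)
class Funsig:
    """Funsig(v : M1) M2."""
    var: str
    arg: object
    res: object


def strengthen_modtype(m, p: str):
    """Compute M/p for a module type M and an access path p."""
    if isinstance(m, Sig):
        return Sig(tuple((v, strengthen_spec(s, f"{p}.{v}")) for v, s in m.fields))
    return m  # functor types are left unchanged


def strengthen_spec(s, p: str):
    """Compute S/p for a specification S and an access path p."""
    is_module = isinstance(s.of, (Sig, Funsig))
    if isinstance(s, Ty):
        # Ty(M)/p = Eq(p : M/p); Ty(t)/p = Eq(p : t); Ty(BT)/p = Eq(p : BT)
        return Eq(p, strengthen_modtype(s.of, p) if is_module else s.of)
    if is_module:
        return Eq(s.body, strengthen_modtype(s.of, p))  # Eq(p' : M)/p = Eq(p' : M/p)
    return s  # Eq(e : t)/p = Eq(e : t) and Eq(p' : BT)/p = Eq(p' : BT)


# Simplified signature of m, restricted to the plus and two components.
sig_m = Sig((
    ("plus", Ty("RewSpec(E_plus, R_plus)")),
    ("two", Eq("plus (S O) (S O)", "nat")),
))
sig_b = strengthen_modtype(sig_m, "A")
```

In this sketch the abstract field plus becomes manifest and points to A.plus, while the already manifest field two is left as it is, which matches the signature of B displayed above.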

Module subtyping is governed by (Sig/Sub) and (Funsig/Sub), as well as 
satisfaction rules (Ty/Sat), (Eq/Sat) and (RewSpec/Sat). Using the extrac- 
tion rules followed by the satisfaction rules it is possible to prove the following: 

    Sig nat, O, S : Ty(IndSpec(E^I_nat := E^C_nat)) End <: Sig nat : Ty(Set) End

    Sig plus : Ty(RewSpec(E^r_plus, R_plus)) End <: Sig plus : Ty(nat → nat → nat) End

Rewrite specifications used in the parameter interface of a functor may ease the
development of the functor body. They impose convertibility constraints on the
functor arguments. Let us suppose that P : nat → Prop is a predicate, and let

    E_fg = f : Ty(nat → nat); g : Ty(nat → nat)
    R_fg = n : Ty(nat) | f (g n) → g (f n) : nat
    E_h  = h : Ty(nat → nat)
    R_h  = m : Ty(nat) | h m → S (S m) : nat

Let us now consider a functor F of the following type

    Funsig(X : Sig f, g : Ty(RewSpec(E_fg, R_fg)) End)
    Sig
      lemma : Eq(λn:nat. λp:P(X.f(X.f(X.g n))). p
                 : Πn:nat. P(X.f(X.f(X.g n))) → P(X.g(X.f(X.f n))))
    End

In the functor result type, correctness of the specification of its single lemma
component relies on the rewrite rule placed in the functor parameter signature.
Now F can be applied to a module path of type

    Sig
      g : Eq(λn:nat. S n : nat → nat)
      f : Ty(RewSpec(E_h, R_h))
    End

In fact, the rule (RewSpec/Sat) allows any sequence of function symbols to
satisfy RewSpec(E^r, R) as long as the corresponding conversion can prove
equalities of left-hand and right-hand sides of all rules in R. In our example,
the structure defines the functions f and g in such a way that both f (g n) and
g (f n) are convertible with S(S(S n)), and therefore the equation expressed by
the rewrite rule in R_fg is satisfied.
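
The convertibility claim of this example can be checked on a toy model. The following Python sketch represents terms as nested tuples and computes both sides of the rule of R_fg for the argument module; the tuple encoding and the Python functions f, g and S are assumptions made for illustration, not part of the calculus.

```python
def S(t):
    """Successor constructor: ("S", t) stands for S t; a string is a variable."""
    return ("S", t)


def g(t):
    """g := λn:nat. S n, the manifest field of the argument signature."""
    return S(t)


def f(t):
    """f behaves as prescribed by the rule  h m → S (S m)  of R_h."""
    return S(S(t))


n = "n"            # an arbitrary variable of type nat
lhs = f(g(n))      # instantiated left-hand side of the rule in R_fg
rhs = g(f(n))      # instantiated right-hand side of the same rule
```

Both lhs and rhs evaluate to the encoding of S(S(S n)), which is the equality that (RewSpec/Sat) requires for the single rule of R_fg.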



5 Conservativity 

Our calculus of named modules is a conservative extension of the pure type 
system with generative definitions. This section presents the outline of the proof 
and a discussion of acceptance criteria for definitions by rewriting.

Closure properties. The calculus of named modules is parametrized by the side- 
conditions for acceptance of inductive definitions and definitions by rewriting. 
In order to make these definitions coexist with modules and in particular with 
module subtyping and functor applications, the acceptance conditions must be 
closed under some operations on environments. Formally, each side-condition C 
(POS, ACC and COMP) must satisfy the following closure properties: 



C_E(Ω) implies C_E'(Ω') if:

(C1) E  = E_1; v : S; E_2
     E' = E_1; E_2{v ↦ p}   with E_1 ⊢ p : S
     Ω' = Ω{v ↦ p},

(C2) E  = E_1; E_3
     E' = E_1; E_2; E_3
     Ω' = Ω

(C3) E  = E_1; v : Ty(Sig v_1 : S_1 ... v_n : S_n End); E_2
     E' = E_1; v_1 : S_1 ... v_n : S_n; E_2{v.v_i ↦ v_i}
     Ω' = Ω{v.v_i ↦ v_i}

(C4) E  = E_1; v : Ty(Funsig(v : M) M'); E_2
     E' = E_1; E_2
     Ω' = Ω

In all of the above implications it is assumed that both environments E and E'
are correct. Moreover, in the last two properties, E_2 does not contain modules.

Even though these closure properties look complicated, the first two simply
correspond to basic meta-theoretical properties of most type systems:
substitutivity and weakening. Note that the condition (C1) is at the same
time simpler and harder than usual substitutivity. On one hand, only paths can
be substituted, which is a crucial simplification for existing syntactic
termination criteria for rewriting. On the other hand, the typing judgment
E_1 ⊢ p : S can result from module subtyping or block satisfaction rules, which
means that p can be more precisely specified than the original variable v.

The property (C2) simply corresponds to modularity of a given acceptance
criterion and is generally needed for effective reloading of previously checked
developments. The remaining conditions (C3) and (C4) are used in the
conservativity proof presented below.

Conservativity. In a calculus with modules, a proof development can be
represented as a structure well-typed in the empty environment. We show that
every such structure ending in a proof of False can be transformed into an
inconsistent closed environment of the underlying pure type system with
generative definitions.

Theorem 1. Every structure well-typed in the empty environment, ending in
a component v : Ty(False) := e, can be transformed into a closed environment
of the underlying pure type system with generative definitions, ending with
v : Eq(e' : False) for some term e'.

Proof sketch. Suppose first that the initial structure contains neither
sub-modules nor manifest blocks. Then it is trivial to transform it into a
closed environment of the underlying pure type system.

Therefore it remains to be shown that any structure can be transformed
into one without sub-modules and manifest blocks. Since every manifest block
v : Eq(v' : BT) := B in a structure without sub-modules can be eliminated by
simply substituting v' for v in the rest of the structure, let us concentrate on
sub-modules. The elimination of the latter is done in two phases. First, going
from left to right, all module applications in the initial structure are
recursively evaluated to weak head normal forms: we replace p_1 p_2 with
m{v ↦ p_2} where p_1 is already evaluated to Functor[v:M] m. Second,
sub-modules are eliminated from right to left, by simply removing functors and
flattening structures.

Below we present the two phases of the conservativity proof in slightly more
detail and explain the role of the closure properties.

Evaluation. While replacing the functor application p_1 p_2 by m{v ↦ p_2}, one
must make sure that the latter expression is well-typed given that m is
well-typed in an environment in which p_1 was defined, extended with v : Ty(M).
If m contains definitions by rewriting or inductive definitions, their
correctness after the substitution can only be proved if the side-conditions
satisfy closure properties (C1) and (C2). The latter ensures that m is still
correct in the environment where p_1 is used extended with v : Ty(M), and (C1)
ensures that the substitution {v ↦ p_2} can safely be applied.
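
The evaluation step can be illustrated by a small Python sketch of the substitution m{v ↦ p} restricted to access paths. The representation is an assumption made for illustration: a path is a tuple of names, a structure body is reduced to a map from component names to the paths occurring in them, and shadowing of the parameter name inside the body is ignored.

```python
from typing import Dict, Tuple

Path = Tuple[str, ...]   # the access path X.f is written ("X", "f")


def subst_path(path: Path, v: str, p: Path) -> Path:
    """Apply {v ↦ p} to an access path: only the head variable can be replaced."""
    if path[0] == v:
        return p + path[1:]
    return path


def apply_functor(param: str, body: Dict[str, Tuple[Path, ...]], arg: Path):
    """Replace the application of Functor[param:M] body to arg by body{param ↦ arg}."""
    return {
        name: tuple(subst_path(q, param, arg) for q in paths)
        for name, paths in body.items()
    }


# Hypothetical functor body mentioning X.f and X.g, applied to the path A.B.
body = {"lemma": (("X", "f"), ("X", "g"), ("nat",))}
result = apply_functor("X", body, ("A", "B"))
```

Because only paths are substituted for the parameter, the result of the substitution is again built from access paths, which is the simplification that condition (C1) exploits.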

The other difficulty in the first phase lies in the termination proof. Fortu- 
nately our calculus is similar to the calculus of Courant and after solving some 
technical problems we can use his results to obtain termination. 

Flattening. The second phase requires showing that module flattening preserves
types. The simple proof by induction on the derivation does not work, so we have
to introduce an auxiliary calculus, called principal, consisting of typing rules
for terms (Fig. 4) and rules for extracting information from the environment
(framed rules in Fig. 6, 7, 8, 9 and 10). We show that term judgments derivable
in both calculi in a given correct environment are the same. Since type
preservation of flattening in the principal calculus can be proved by simple
induction on the derivation, using closure properties (C3) and (C4), we are
done. The complete proof can be found in [4]. □



Satisfying closure properties. Let us now fix the base pure type system to be the 
calculus of constructions. Since inductive types are often theoretically analysed 
as anonymous entities, all known positivity conditions POS for inductive types 
obviously satisfy the closure properties. 

It also turns out that syntactic order-based termination criteria for
object-level rewriting, as defined in [2,19], restricted by the requirement that
all function symbols appearing in the left-hand sides of rules of R belong to
E^r and by the critical pairs test, also satisfy the above properties and
therefore are suitable for ACC. Greatly simplified, both termination criteria
[2,19] restrict right-hand sides of rewrite rules to be constructed from local
variables, previously defined function symbols, constructors of inductive types
and recursive calls with smaller arguments. Smaller here means the (extended)
subterm relation for [2] or a recursive comparison for [19]. The rewriting
system defined in Sect. 2 is accepted by both termination criteria and all its
critical pairs are joinable.

Due to modularity constraints it seems unlikely that foreign function symbols
can be allowed in left-hand sides of rules, as in the third rule below:

    Functor [A : Sig Rewriting plus, mult : nat → nat → nat Rules R End]
    Struct
      Rewriting exp : nat → nat → nat
      Rules
        exp x O → S O
        exp x (S n) → A.mult x (exp x n)
        exp x (A.plus n m) → A.mult (exp x n) (exp x m)
    End

Indeed, even if the current rewrite rules R for plus and mult allow one to prove
local confluence of the system for exp, when some module containing another
definition of plus and mult is substituted for A, the system may stop being
confluent.

The status of the condition COMP is the least clear, because little is known 
about consistency of the calculus of constructions with rewriting. It is believed 
that completeness of definitions techniques of [6] requiring that all constructor 
instances of a function symbol are reducible, can be extended to the calculus 
of constructions and proved to guarantee consistency. Such a condition COMP 
would certainly satisfy the closure properties. Some work in this direction has
already been done in [2], but the completeness criterion given there is
not modular.
6 Conclusions

We have presented the formalization of a pure type system with generative
inductive definitions and definitions by rewriting and showed that its extension
with a system of named modules is conservative. The proof allowed us to
formulate closure properties which must be satisfied by acceptance criteria for
rewrite rules in order to be useful in the calculus with modules.

Since some existing acceptance criteria can already be adapted to satisfy the 
closure properties, the implementation of a term rewriting engine in Coq is by no 
means stopped by the modules. Still, finding really flexible acceptance criteria 
for rewriting is a challenging subject for future work. 
Abstract. The rewriting calculus, also called ρ-calculus, is a framework
embedding λ-calculus and rewriting capabilities, by allowing abstraction not
only on variables but also on patterns. The higher-order mechanisms of the
λ-calculus and the pattern matching facilities of rewriting are then both
available at the same level. Many type systems for the λ-calculus can be
generalized to the ρ-calculus: in this paper, we study extensively a first-order
ρ-calculus à la Church, called ρ^→_stk. The type system of ρ^→_stk allows one to
type (object oriented flavored) fixpoints, leading to an expressive and safe
calculus. In particular, using pattern matching, one can encode and typecheck
term rewriting systems in a natural and automatic way. Therefore, we can see our
framework as a starting point for the theoretical basis of a powerful typed
rewriting-based language.

Keywords. Rewriting-calculus, Lambda-calculus, Object-calculus, Pattern
Matching, Fixpoints, Type Theory.



1 Introduction 

It is not by chance that pattern matching appears as the core mechanism of term
rewriting: in fact, the ability to discriminate patterns has been present since
the beginning of information processing modeling. Pattern matching has also been
widely used in functional programming (e.g. ML, Haskell, Scheme), logic
programming (e.g. Prolog), rewrite based programming (e.g. Elan [5], Maude [16])
and script programming (e.g. sed, awk). It has been generally considered as a
convenient mechanism for expressing complex requirements about the argument of a
function, more than a real computation paradigm.

The Rewriting Calculus, by unifying λ-calculus and rewriting, makes all the
basic ingredients of rewriting explicit objects, in particular the notions of
rule application and result. Pattern matching can therefore be used widely, and
a rewrite rule becomes a first-class object, which can be created, manipulated
and modified by the calculus itself. We have already shown [8] that the first
version of the rewriting calculus can be used as an operational semantics for
rewriting based languages and in particular for Elan. For this we have used in
the past fixpoint operators inspired by those of the λ-calculus and thus
untypable in the early version of the simply typed rewriting calculus [7].

Nevertheless, static analysis via a suitable typing system enforces a stronger
programming discipline. The main objective of this paper is to present a
ρ-calculus à la Church (ρ^→_stk) featuring first-order types and well-typed
self-duplicating terms.



S. Berardi, M. Coppo, and F. Damiani (Eds.): TYPES 2003, LNCS 3085, pp. 147-161, 2004. 
(c) Springer- Verlag Berlin Heidelberg 2004 
In ρ^→_stk (typed) pattern matching is the basic mechanism for programming,
allowing one to build and typecheck non-normalizing terms: this enables the
definition of some interesting functional recursion operators. Moreover, the
type system of ρ^→_stk is powerful enough to ensure well-typedness of matching
equations, i.e. the instantiation of formal parameters complies with the typing
discipline. Hence, ρ^→_stk represents a good trade-off between the flexibility
and the expressiveness of the untyped calculus, and the strictness of a more
strongly typed one. This leads us to consider the presented typed system for
ρ-calculus as a good candidate for giving the static semantics of a family of
rewriting-based languages such as Elan, Maude, etc.

One of the particularities of the type system of ρ^→_stk is that it relaxes,
using the well-known result of N.P. Mendler [15], the classical property that
“well-typed programs normalize”. More precisely, non-termination can be
type-checked in ρ^→_stk thanks to ad hoc patterns; it follows that, roughly
speaking, an ML-like let becomes a letrec by abstracting over a suitable
algebraic pattern P.

Nevertheless, it is important to remark that when the type discipline is enhanced with 
dependent types, as it was done recently by the authors [3], all the programs presented in 
this paper are statically rejected, i.e. blocked by the type system. The chosen dependent 
type theory introduces pattern matching inside types, and matching failures significantly 
restrict the set of type-checked programs. In fact, the present paper does not fit into the 
philosophy of [3], where (dependent) type systems were studied especially for logical 
(proof-oriented) purposes and thus concerned with strong normalization of typable terms. 

Plan of the paper. In Section 2, we will describe the syntax and the evaluation
rules of ρ^→_stk. We will see how an equivalence on terms handles the
undesirable matching failures. Section 3 describes the type system of ρ^→_stk.
We give some simple type derivations to show how the type system deals with
patterns and we state metatheoretical properties of ρ^→_stk. In Section 4, we
explain how a careful use of the pattern matching capabilities allows us to
encode various object calculi and term rewriting systems.

2 The System ρ^→_stk

This section presents the basis of ρ^→_stk: its syntax, its semantics and some
examples showing its expressiveness.

2.1 Syntax 

In this paper, we consider the meta-symbols “→” (function- and
type-abstraction), “[ ≪ ]” (delayed matching constraint), an application
operator denoted by concatenation, and “,” (structure operator). We assume that
the application operator associates to the left, while the other operators
associate to the right. The priority of the application is higher than that of
“[ ≪ ]”, which is higher than that of “→”, which is, in turn, of higher priority
than the “,”. The symbol τ ranges over the set 𝒯y of types, the symbol ι ranges
over the set 𝒦y of type constants (𝒦y ⊆ 𝒯y), the symbols A, B, C, ... range over
the set 𝒯 of terms, the symbols X, Y, Z, ... range over the set 𝒳 of variables
(𝒳 ⊆ 𝒯), the symbols a, b, c, ..., f, g, h range over a set 𝒦 of term constants
(𝒦 ⊆ 𝒯). Finally, the symbols P, Q range over the set 𝒫 of patterns
(𝒳 ⊆ 𝒫 ⊆ 𝒯). Sometimes we will use the “overloaded” symbol α ∈ 𝒳 ∪ 𝒦, and we
write Ā for A_1 ... A_n, for n > 0. The syntax is presented in Figure 1.

    τ ::= ι | τ → τ                                           𝒯y  Types
    Δ ::= ∅ | Δ, X:τ | Δ, f:τ                                     Contexts
    P ::= X | stk | f P̄   (variables occur only once in any P)    𝒫   Patterns
    A ::= f | stk | X | P →_Δ A | [P ≪_Δ A]A | A A | A, A         𝒯   Terms

Fig. 1. Syntax of ρ^→_stk

The types are as one would expect from a first-order type system, i.e.
constant-types and arrow-types. The patterns are algebraic terms (i.e. terms
constructed only with variables, constants and application) which can be used
as left-hand sides of the rewrite rules; the set of patterns is obviously
included in the set of terms. The well-known linearity restriction [17] is
needed to keep the small-step semantics confluent. A rewrite rule of the form
(P →_Δ A) abstracting over the free variables of P is a first-class citizen of
the calculus. The types of the free variables of P are declared in Δ, i.e.
Fv(P) = Dom(Δ), resulting in a fully annotated calculus à la Church. An
application is implicitly denoted by concatenation. The delayed matching
constraint [P ≪_Δ A]B can be seen as the term B with its free variables
constrained by the matching between P and A. Again, the context Δ contains the
type declarations of all the free variables appearing in the pattern P. A
structure is a collection of terms that can be seen either as a set of rewrite
rules or as a set of results. As we will see in Section 2.3, the symbol stk can
be considered as the special constant representing a delayed matching
constraint whose matching problem is unsolvable. An alternative approach would
be to omit this symbol from the syntax but this has two drawbacks: first, the
axioms of the theory presented in Def. 5 would become more complicated;
moreover, we would lose the expression of first given in Section 4.2, and thus
the proposed encoding of rewriting systems would no longer be possible.

A type judgment (defined in Section 3) stating that a term A has the type τ in a
context Γ is written Γ ⊢ A : τ.



Free Variables and Substitutions. We introduce the notion of free variable and sub- 
stitution. 

Definition 1 (Free variables Fv).

    Fv(f)           = ∅
    Fv(stk)         = ∅
    Fv(X)           = {X}
    Fv(P →_Δ A)     = Fv(A) \ Fv(P)
    Fv([P ≪_Δ A]B)  = Fv((P →_Δ B) A)
    Fv(A B)         = Fv(A, B) = Fv(A) ∪ Fv(B)
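
The clauses of Definition 1 translate directly into a recursive function. The Python sketch below is only an illustration: the tagged-tuple encoding of terms ("const", "stk", "var", "abs", "match", "app", "struct") is an assumption, and the context annotation Δ is omitted because it plays no role in the computation.

```python
def fv(t):
    """Free variables of a term, following Definition 1 clause by clause."""
    tag = t[0]
    if tag in ("const", "stk"):
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "abs":                      # P → A binds the variables of P in A
        _, p, a = t
        return fv(a) - fv(p)
    if tag == "match":                    # Fv([P ≪ A]B) = Fv((P → B) A)
        _, p, a, b = t
        return fv(("app", ("abs", p, b), a))
    if tag in ("app", "struct"):          # A B and A, B
        return fv(t[1]) | fv(t[2])
    raise ValueError(f"unknown term: {t!r}")


X, Y, Z = ("var", "X"), ("var", "Y"), ("var", "Z")
f, g = ("const", "f"), ("const", "g")
rule = ("abs", ("app", f, X), ("app", ("app", g, X), Y))   # f X → g X Y
```

For the rule f X → g X Y, the variable X is bound by the pattern and only Y remains free; applying the rule to Z adds Z to the set.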
As usual, we work modulo α-conversion and we adopt Barendregt’s
“hygiene-convention” [2], i.e. free and bound variables have different names.
This allows us to define substitutions quite straightforwardly, since it avoids
issues like variable capture.



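Definition 2 (Substitutions).

A substitution $\theta$ is a mapping from the set of variables to the set of terms. A finite
substitution $\theta$ has the form $\{A_1/X_1 \dots A_m/X_m\}$ and its domain $\{X_1, \dots, X_m\}$ is
denoted by $Dom(\theta)$. The application of a substitution $\theta$ to a term A, denoted by $A\theta$, is
defined as follows:

$$
\begin{array}{ll}
f\theta = f & (P \to_\Delta A)\theta = P \to_\Delta A\theta \\
\mathsf{stk}\,\theta = \mathsf{stk} & ([P \ll_\Delta A]B)\theta = [P \ll_\Delta A\theta]B\theta \\
X_i\theta = \begin{cases} A_i & \text{if } X_i \in Dom(\theta) \\ X_i & \text{otherwise} \end{cases} & (A\ B)\theta = A\theta\ B\theta \\
 & (A, B)\theta = A\theta,\ B\theta
\end{array}
$$

A substitution $\theta$ is well-typed in context $\Gamma$ if for any $X \in Dom(\theta)$ such that $\Gamma \vdash X : \tau$
we have $\Gamma \vdash X\theta : \tau$.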
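
The clauses of Definition 2 can be read as a structural recursion. The following is a minimal sketch, assuming a tuple-based term representation that is not part of the paper (the tags `"const"`, `"var"`, `"abs"`, `"match"`, `"app"`, `"struct"` and the function name `apply_subst` are hypothetical). It relies on the hygiene convention above, so no renaming of bound pattern variables is attempted.

```python
# Assumed term representation (hypothetical, for illustration only):
#   ("const", f)          a constant f
#   ("stk",)              the stuck constant
#   ("var", X)            a variable X
#   ("abs", P, A)         the abstraction P -> A
#   ("match", P, A, B)    the delayed matching constraint [P << A]B
#   ("app", A, B)         the application A B
#   ("struct", A, B)      the structure A, B

def apply_subst(term, theta):
    """Apply a finite substitution theta (dict: variable name -> term)."""
    tag = term[0]
    if tag in ("const", "stk"):
        return term                                  # f theta = f, stk theta = stk
    if tag == "var":
        return theta.get(term[1], term)              # A_i if X_i in Dom(theta)
    if tag == "abs":
        _, pattern, body = term                      # the pattern is left untouched
        return ("abs", pattern, apply_subst(body, theta))
    if tag == "match":
        _, pattern, arg, body = term
        return ("match", pattern,
                apply_subst(arg, theta), apply_subst(body, theta))
    if tag in ("app", "struct"):
        _, left, right = term
        return (tag, apply_subst(left, theta), apply_subst(right, theta))
    raise ValueError(f"unknown term: {term!r}")


f, a = ("const", "f"), ("const", "a")
X, Y = ("var", "X"), ("var", "Y")
print(apply_subst(("app", f, X), {"X": a}))
```

Well-typedness of the substitution is not checked here; the sketch only covers the untyped application of a substitution to a term.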



Matching Equations, Theories. The core mechanism of the rewriting calculus is pattern 
matching since, as we have already mentioned, when a delayed matching constraint is 
evaluated the corresponding matching problem should be solved. We define first the 
classical notions of matching equations and matching solutions. 

Definition 3 (Matching).

Given a theory $\mathbb{T}$ (i.e. a set of axioms defining a congruence relation $=_{\mathbb{T}}$):

1. A matching equation is a problem $\mathcal{T} = P \ll_{\mathbb{T}} A$ with P a pattern and A a term.

2. A substitution $\theta$ is a solution of the matching equation $\mathcal{T}$ if:

a) $P\theta =_{\mathbb{T}} A$

b) $\theta$ is well-typed in any context $\Gamma$ in which A and P are typable.

The set of solutions of $\mathcal{T}$ is denoted by $Sol(\mathcal{T})$.

Different theories and the corresponding pattern matching problems can be formally
defined and solved, for example, as explained in [9]. By convention, if the solution of
the equation $A \ll_{\mathbb{T}} B$ is unique, it is denoted by $\theta_{(A \ll_{\mathbb{T}} B)}$.
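
For the empty theory, a matching equation has at most one solution and can be solved by a simple recursive comparison. The sketch below illustrates this special case only; the tuple representation of terms and the name `match` are assumptions of the example, and condition b) on well-typedness is not checked.

```python
# Assumed representation (hypothetical): ("const", f), ("stk",), ("var", X),
# ("app", A, B). Patterns are algebraic: variables, constants, stk, applications.

def match(pattern, term, theta=None):
    """Return a dict theta with pattern*theta == term syntactically, or None."""
    theta = dict(theta or {})
    tag = pattern[0]
    if tag == "var":
        bound = theta.get(pattern[1])
        if bound is None:
            theta[pattern[1]] = term          # first occurrence: bind the variable
            return theta
        return theta if bound == term else None   # repeated variable: must agree
    if tag in ("const", "stk"):
        return theta if pattern == term else None
    if tag == "app" and term[0] == "app":
        theta = match(pattern[1], term[1], theta)
        if theta is None:
            return None
        return match(pattern[2], term[2], theta)
    return None


f, g, a = ("const", "f"), ("const", "g"), ("const", "a")
X = ("var", "X")
print(match(("app", f, X), ("app", f, a)))   # a solution binding X to a
print(match(("app", f, X), ("app", g, a)))   # no solution
```

Matching modulo a richer theory (associativity, commutativity, or the stuck theory of Section 2.3) may yield several or infinitely many solutions and needs a dedicated algorithm.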

2.2 Operational Semantics of the General Rewriting Calculus, $\rho_{\mathbb{T}}$

We have now settled all the background necessary to describe in Figure 2 the
reduction rules of the general $\rho$-calculus, called $\rho_{\mathbb{T}}$, parameterized by the theory $\mathbb{T}$. When
instantiating $\mathbb{T}$ with concrete theories (e.g. theories containing axioms for associativity,
associativity-commutativity, etc., for a given symbol) different versions of the calculus
are obtained. When not essential or clear from the context, we will omit the theory $\mathbb{T}$ in
rules and congruences.

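$$
\begin{array}{lrcl}
(\rho) & (P \to_\Delta A)\ B & \to_\rho & [P \ll_\Delta B]A \\
(\sigma) & [P \ll_\Delta B]A & \to_\sigma & A\theta_1, \dots, A\theta_n \quad \text{with } \{\theta_1, \dots, \theta_n\} = Sol(P \ll_{\mathbb{T}} B) \\
(\delta) & (A, B)\ C & \to_\delta & A\ C,\ B\ C
\end{array}
$$

Fig. 2. Top-level Rules of the General Rewriting Calculus, $\rho_{\mathbb{T}}$

Let us quickly explain the top-level rules: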



($\rho$) this rule “fires” the application of an abstraction to a term, but does not immediately
try to solve the associated matching equation.

($\sigma$) this rule is applied if (and only if) the matching equation $P \ll_{\mathbb{T}} B$ has at least one
solution: in this case the matching solutions are computed and applied to the term
A. If the matching is not unitary, a structure collecting all the different results is
obtained when the rule is applied. If there is no solution, this rule does not apply
and thus the term represents a matching failure. As we will see, further reductions
or instantiations are likely to modify B so that the equation has a solution and the
rule can be fired.

($\delta$) this rule distributes structures on the left-hand side of the application. This gives the
possibility, for example, to apply in parallel two distinct pattern abstractions A and
B to a term C.

We denote by $\mapsto_{\rho\sigma\delta}$ the contextual closure of these rules. Its reflexive and transitive closure
is denoted $\mapsto\!\!\!\to_{\rho\sigma\delta}$. The symmetric and transitive closure of $\mapsto\!\!\!\to_{\rho\sigma\delta}$ is denoted $=_{\rho\sigma\delta}$.

2.3 The Fixpoint Rewriting Calculus, $\rho^{\mathsf{fix}}_{\mathsf{stk}}$

We present a version of the rewriting calculus that handles matching failures uniformly
and eliminates them when they are not significant for the computation. We define the rules for
handling this kind of term and we show how these are integrated in the calculus.

We first define a superposition relation $\sqsubseteq\ \subseteq \mathcal{P} \times \mathcal{T}$ between patterns and terms
whose aim is to characterize a broad class of matching equations that are potentially
solvable. If $P \sqsubseteq A$ we say that “P does potentially superpose with A” and, by negation,
if $P \not\sqsubseteq A$ then “P surely does not superpose with A” (i.e. independently of subsequent
instantiations and reductions).

Definition 4 (Superposition).

1. The relation $P \sqsubseteq A$ is defined as follows by cases on the structure of P:

$$
\begin{array}{l}
f \sqsubseteq f \qquad \mathsf{stk} \sqsubseteq \mathsf{stk} \qquad X \sqsubseteq A \quad (\forall A) \\[4pt]
f\ \overline{A} \sqsubseteq B \quad \text{if } (B \equiv f\ \overline{B}) \wedge \overline{A} \sqsubseteq \overline{B} \\[4pt]
P \sqsubseteq A \quad \text{if } A \equiv \begin{cases} X \ \vee\ (A_1, A_2) \ \vee\ (A_1\ A_2 \wedge A_1 \notin \mathcal{P}) \ \vee \\ ([Q \ll_\Delta A_1]A_2 \wedge Q \sqsubseteq A_1 \wedge P \sqsubseteq A_2) \end{cases} \quad (\forall P)
\end{array}
$$

2. If $P \sqsubseteq A$ is not satisfied we write $P \not\sqsubseteq A$.

Starting from this relation, we define a reduction that eliminates from a $\rho$-term all the
definitive stuck subterms, i.e. all the delayed matching constraints whose matching
problem is unsolvable independently of subsequent instantiations and reductions.

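Definition 5 (Stuck Theory, $\mathbb{T}_{\mathsf{stk}}$).

The relation $\to_{\mathsf{stk}}$ is defined by the following rules:

$$
\begin{array}{rcll}
[P \ll_\Delta A]B & \to_{\mathsf{stk}} & \mathsf{stk} \quad \text{if } P \not\sqsubseteq A & (1) \\
\mathsf{stk}, A & \to_{\mathsf{stk}} & A & (2) \\
A, \mathsf{stk} & \to_{\mathsf{stk}} & A & (3) \\
\mathsf{stk}\ A & \to_{\mathsf{stk}} & \mathsf{stk} & (4)
\end{array}
$$

We denote by $\mapsto_{\mathsf{stk}}$ the contextual closure of these rules. Its reflexive and transitive
closure is denoted by $\mapsto\!\!\!\to_{\mathsf{stk}}$. The symmetric and transitive closure of $\mapsto\!\!\!\to_{\mathsf{stk}}$ is denoted by
$=_{\mathsf{stk}}$. Let $\mathbb{T}_{\mathsf{stk}}$ be the theory associated to the congruence $=_{\mathsf{stk}}$. Matching equations in the
theory $\mathbb{T}_{\mathsf{stk}}$ are denoted $P \ll_{\mathsf{stk}} A$.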

As mentioned previously, these rules are used to propagate or eliminate the definitive 
stuck terms: 



- Structures can be seen as collections of results and thus we want to identify all the
(matching) failures and eliminate them from these collections; this is done by the
first rules (1)-(3);

- On the other hand, a stk term can be seen as an empty set of results; the rule
(4) corresponds then to the ($\delta$) rule dealing with empty structures and thus to a
propagation of the failure.
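
To make the elimination and propagation of stuck terms concrete, here is a minimal sketch of a normalizer for rules (2) to (4) of Definition 5. The tuple representation and the name `stk_normalize` are assumptions of the example. Rule (1) is deliberately left out, since it needs the superposition test of Definition 4, and patterns are not traversed.

```python
# Assumed representation (hypothetical): ("const", f), ("stk",), ("var", X),
# ("abs", P, A), ("match", P, A, B) for [P << A]B, ("app", A, B), ("struct", A, B).

STK = ("stk",)

def stk_normalize(term):
    """Normalize a term bottom-up with respect to rules (2), (3) and (4)."""
    tag = term[0]
    if tag == "struct":
        left, right = stk_normalize(term[1]), stk_normalize(term[2])
        if left == STK:
            return right                     # rule (2): stk, A -> A
        if right == STK:
            return left                      # rule (3): A, stk -> A
        return ("struct", left, right)
    if tag == "app":
        head, arg = stk_normalize(term[1]), stk_normalize(term[2])
        if head == STK:
            return STK                       # rule (4): stk A -> stk
        return ("app", head, arg)
    if tag == "abs":
        return ("abs", term[1], stk_normalize(term[2]))
    if tag == "match":
        return ("match", term[1], stk_normalize(term[2]), stk_normalize(term[3]))
    return term                              # constants, variables, stk


a = ("const", "a")
print(stk_normalize(("struct", STK, ("struct", a, STK))))
print(stk_normalize(("app", ("struct", STK, STK), a)))
```

Because subterms are normalized before the root is inspected, a single pass suffices for these three rules: once the children are in normal form, at most one rule can apply at the root and its result is again normal.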



Lemma 1 (Confluence and Termination of stk-reduction).

The reduction $\mapsto_{\mathsf{stk}}$ is confluent and terminating.

In general, matching modulo the $\mathbb{T}_{\mathsf{stk}}$ theory is obviously infinitary. When restricting
to matching equations with an algebraic left-hand side we can still have an infinite
number of solutions, but a unique representative can always be characterized. Intuitively,
in this latter case, the canonical solution of a matching equation is the solution obtained
by a syntactic matching algorithm with all the terms reduced to $\mapsto_{\mathsf{stk}}$-normal form. For
example, since the solution of the equation $f\ X \ll_{\emptyset} f\ (a, \mathsf{stk})$ is $\{(a, \mathsf{stk})/X\}$, the
solution of $f\ X \ll_{\mathsf{stk}} f\ (a, \mathsf{stk})$ is $\{a/X\}$, representing a witness for all the solutions
of the shape $\{(\mathsf{stk}, \dots, a, \dots, \mathsf{stk})/X\}$.

Thus, for the sake of simplicity and in order to stay closer to possible implementations,
we define the underlying relation of the calculus:

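Definition 6 (Semantics of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$).

The underlying relation of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$, denoted $\mapsto^{\mathsf{stk}}_{\rho\sigma\delta}$, is defined as the relation $\mapsto\!\!\!\to_{\mathsf{stk}} \cup \mapsto_{\rho\sigma\delta}$.

For $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ the following holds:

Theorem 1 (Church-Rosser for $\rho^{\mathsf{fix}}_{\mathsf{stk}}$).

The relation $\mapsto^{\mathsf{stk}}_{\rho\sigma\delta}$ is confluent.

3 The First-Order Type System for $\rho^{\mathsf{fix}}_{\mathsf{stk}}$

Figure 3 presents the typing rules of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$, which are directly inspired by the simply typed
$\lambda$-calculus.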



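$$
\frac{a{:}\tau \in \Gamma \qquad \tau \in Ty}{\Gamma \vdash a : \tau}\ (Start)
\qquad
\frac{\Gamma, \Delta \vdash P : \tau_1 \qquad \Gamma, \Delta \vdash A : \tau_2}{\Gamma \vdash P \to_\Delta A : \tau_1 \to \tau_2}\ (Abs)
$$

$$
\frac{\tau \in Ty}{\Gamma \vdash \mathsf{stk} : \tau}\ (Stuck)
\qquad
\frac{\Gamma \vdash A : \tau_1 \to \tau_2 \qquad \Gamma \vdash B : \tau_1}{\Gamma \vdash A\ B : \tau_2}\ (Appl)
$$

$$
\frac{\Gamma \vdash A : \tau \qquad \Gamma \vdash B : \tau}{\Gamma \vdash A, B : \tau}\ (Struct)
\qquad
\frac{\Gamma, \Delta \vdash P : \tau_1 \qquad \Gamma \vdash B : \tau_1 \qquad \Gamma, \Delta \vdash A : \tau_2}{\Gamma \vdash [P \ll_\Delta B]A : \tau_2}\ (Match)
$$

Fig. 3. The Type System for $\rho^{\mathsf{fix}}_{\mathsf{stk}}$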



- (Start): The context determines the type of variables and constants. It cannot contain
two declarations for the same variable (or constant);

- (Abs): As mentioned in Section 2.1, $Dom(\Delta) = Fv(P)$. For the left-hand side of the
arrow-type, we use the type of the pattern P; notice that the (Abs) rule allows one to
hide some type information in a pattern containing applications, e.g. $\tau_2$ disappears
in the final type of $(f\ X)$ in the judgment $f{:}\tau_2 \to \tau_1,\ X{:}\tau_2 \vdash f\ X : \tau_1$.

- (Appl): We directly exploit the information given in the type of the function, statically
checking that the given argument has the expected type $\tau_1$;

- (Struct): This rule states that all the members of a structure have the same type. This
is important when considering structures as a collection of results; if a function can
return different results, we would at least expect them to have the same type;

- (Stuck): Since stk can appear in any structure, it can have any type;

- (Match): This rule states that the constraint $[P \ll_\Delta B]A$ gets the same type as
$(P \to_\Delta A)\ B$. This is sound since $(P \to_\Delta A)\ B \to_\rho [P \ll_\Delta B]A$. Once again,
$Dom(\Delta) = Fv(P)$.
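
The rules of Figure 3 are syntax-directed, so for stk-free terms a type can be computed by structural recursion. The following is a minimal sketch under stated assumptions: the tuple representation, the encoding of arrow types as `("arrow", t1, t2)` and the name `infer` are hypothetical, and terms containing stk are rejected because stk has every type.

```python
# Assumed representation (hypothetical): atomic types are strings, arrow types
# are ("arrow", t1, t2). Terms: ("const", f), ("var", X), ("app", A, B),
# ("struct", A, B), ("abs", P, delta, A) for P ->_delta A, and
# ("match", P, delta, B, A) for [P <<_delta B]A, where delta is a dict.

def infer(gamma, term):
    """Return the type of a stk-free term in context gamma, or raise TypeError."""
    tag = term[0]
    if tag in ("var", "const"):                       # (Start)
        if term[1] not in gamma:
            raise TypeError(f"undeclared symbol {term[1]}")
        return gamma[term[1]]
    if tag == "abs":                                  # (Abs)
        _, pattern, delta, body = term
        extended = {**gamma, **delta}
        return ("arrow", infer(extended, pattern), infer(extended, body))
    if tag == "app":                                  # (Appl)
        fun, arg = infer(gamma, term[1]), infer(gamma, term[2])
        if not (isinstance(fun, tuple) and fun[0] == "arrow" and fun[1] == arg):
            raise TypeError("ill-typed application")
        return fun[2]
    if tag == "struct":                               # (Struct)
        left, right = infer(gamma, term[1]), infer(gamma, term[2])
        if left != right:
            raise TypeError("members of a structure must share one type")
        return left
    if tag == "match":                                # (Match)
        _, pattern, delta, arg, body = term
        extended = {**gamma, **delta}
        if infer(extended, pattern) != infer(gamma, arg):
            raise TypeError("pattern and argument types differ")
        return infer(extended, body)
    raise TypeError("stk has every type; no type is inferred for it")


# The term (f X ->_(X:tau1) g X) (f a) with f : tau1 -> tau2 and g : tau1 -> iota.
gamma = {"f": ("arrow", "tau1", "tau2"), "g": ("arrow", "tau1", "iota"), "a": "tau1"}
rule = ("abs", ("app", ("const", "f"), ("var", "X")), {"X": "tau1"},
        ("app", ("const", "g"), ("var", "X")))
print(infer(gamma, ("app", rule, ("app", ("const", "f"), ("const", "a")))))  # iota
```

The sketch does not verify the side condition $Dom(\Delta) = Fv(P)$; a fuller checker would compare the keys of `delta` with the free variables of the pattern.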

Example 1 (Simple type derivation).

The (Appl) rule is effective for the typing of algebraic terms too. Let $\Gamma = f{:}\iota \to \iota,\ a{:}\iota$.

$$
\frac{\Gamma \vdash f : \iota \to \iota \qquad \Gamma \vdash a : \iota}{\Gamma \vdash f\ a : \iota}\ (Appl)
$$

and, let $\Gamma = f{:}\tau_1 \to \tau_2,\ g{:}\tau_1 \to \iota,\ a{:}\tau_1$.

$$
\frac{\dfrac{\Gamma, X{:}\tau_1 \vdash f\ X : \tau_2 \qquad \Gamma, X{:}\tau_1 \vdash g\ X : \iota}{\Gamma \vdash f\ X \to_{(X:\tau_1)} g\ X : \tau_2 \to \iota}\ (Abs) \qquad \Gamma \vdash f\ a : \tau_2}{\Gamma \vdash (f\ X \to_{(X:\tau_1)} g\ X)\ (f\ a) : \iota}\ (Appl)
$$

This type system has been designed as a typing discipline for a programming language:
its aim is to ensure that the arguments of a function have the same types as the
corresponding formal parameters. However, the notion of (well-typed) pattern used here
is crucial since it guarantees that the instantiation of the variables of the pattern will be
correct with respect to types (i.e. the substitutions obtained as a result of the matching are
well-typed) even if no type-checking is performed in the matching algorithm.

We state in what follows the main properties of the typed calculus. 

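Lemma 2 (Substitution Lemma).

If $\Gamma, \Delta \vdash A : \tau$, then for any substitution $\theta$ well-typed in $\Gamma$ such that $Dom(\theta) = Dom(\Delta)$,
we have $\Gamma \vdash A\theta : \tau$.

Theorem 2 (Subject Reduction for $\rho^{\mathsf{fix}}_{\mathsf{stk}}$).

If $\Gamma \vdash A : \tau$ and $A \mapsto\!\!\!\to^{\mathsf{stk}}_{\rho\sigma\delta} B$, then $\Gamma \vdash B : \tau$.

Theorem 3 (Type Uniqueness for $\rho^{\mathsf{fix}}_{\mathsf{stk}}$).

If $\Gamma \vdash A : \tau_1$ and $\Gamma \vdash A : \tau_2$, and $\mathsf{stk} \notin A$, then $\tau_1 = \tau_2$.

Theorem 4 (Decidability of Typing for $\rho^{\mathsf{fix}}_{\mathsf{stk}}$).

If $\mathsf{stk} \notin A$ and $Fv(A) \subseteq Dom(\Gamma)$, then the following problems are decidable:

1. Type Reconstruction: for a given $\Gamma$, is there a type $\tau$ such that $\Gamma \vdash A : \tau$?

2. Type Checking: given a context $\Gamma$ and a type $\tau$, is it true that $\Gamma \vdash A : \tau$?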

4 Examples and Applications in $\rho^{\mathsf{fix}}_{\mathsf{stk}}$

It has already been shown that the $\rho$-calculus allows one to faithfully encode first-order
term rewriting [8] as well as some classical object-calculi [9]. In this section, we show
that $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ is sufficiently expressive and flexible for a more concise encoding that does not
break the type discipline for these two formalisms. In most of the section, some type
decorations of variables and constants are omitted for the sake of readability.



4.1 Encoding Abadi and Cardelli’s Object-Calculus

In this section we briefly describe an encoding in the typed $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ of the classical object-calculus
$\varsigma Obj$ [1]. By better exploiting the pattern matching facilities of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ we obtain
a more concise representation than the one given in previous works for the untyped
$\rho$-calculus. A method is encoded as $(m\ S) \to T_m$ (see footnote 1), where the constant m is the name
of the method, the variable S plays the role of the keyword this (containing a copy
of the object itself) and $T_m$ is a term encoding the body of the method. An object obj is
then a structure filled with methods. The method meth is then called by Kamin’s
self-application [13], which says that obj.meth = obj (meth obj):

$$
\begin{array}{rcl}
obj\ (meth\ obj) & = & (\dots,\ meth\ S \to T_{meth},\ \dots)\ (meth\ obj) \\
 & \mapsto\!\!\!\to_{\rho\delta} & \dots,\ [meth\ S \ll meth\ obj]\,T_{meth},\ \dots \\
 & \mapsto\!\!\!\to^{\mathsf{stk}}_{\rho\sigma\delta} & T_{meth}[obj/S]
\end{array}
$$

1 In [9], the original encoding was $m \to S \to T_m$, needing two reduction steps where one is
enough with our enhanced encoding.

Observe that the other methods fail because the equation $(m\ S \ll meth\ obj)$ has
no solution for every $m \neq meth$. The stk terms obtained for each of these method
applications can be eliminated from the final result by successive $\mapsto_{\mathsf{stk}}$ steps. The variable
S is indeed instantiated with obj in the body of the method, allowing all the usual
operations on the keyword this.

As such, the previous example can be typed in the $\rho$-calculus as follows: lab is the
constant type of labels, S has type $lab \to \tau$, and $\tau$ is the type of $T_{meth}$. For the sake
of simplicity, we suppose obj has just one method triggered by the constant meth, with
type $(lab \to \tau) \to lab$.

$$
\frac{\Gamma \vdash meth\ S : lab \qquad \Gamma \vdash T_{meth} : \tau}{\Gamma \vdash meth\ S \to T_{meth} : lab \to \tau}\ (Abs)
$$

Considering the meaning we want for S, it is sound that obj and S have the same type.
Then obj.meth = obj (meth obj) can be typed as follows (let $\Gamma = meth{:}(lab \to \tau) \to lab, \dots$):

$$
\frac{\Gamma \vdash obj : lab \to \tau \qquad \Gamma \vdash meth\ obj : lab}{\Gamma \vdash obj\ (meth\ obj) : \tau}
$$

We end this subsection with an object-oriented version of $\omega\omega$, showing that the
divergence of object-oriented programs is somehow built into the self-application. Remember
that S.loop denotes the self-application S (loop S).

$$
\begin{array}{rcl}
\vdash \Omega & = & (loop\ S) \to S.loop \ :\ lab \to \tau \\
\Omega.loop & = & (loop\ S \to S.loop)\ (loop\ \Omega) \\
 & \mapsto\!\!\!\to & \Omega.loop \\
 & \mapsto\!\!\!\to & \dots
\end{array}
$$

4.2 Encoding Term Rewriting Systems (TRS)

The correspondence between the $\rho$-calculus and TRS is not as straightforward as it
may seem. Observe that a $\rho$-abstraction is consumed by a $\rho$-reduction, and therefore can
operate only locally. For instance, the simple (one-rule) TRS consisting of $f(X) \to X$
reduces $f(f(f(a)))$ to a. In the $\rho$-calculus, we can have control over the application of
this rule:

$$
\begin{array}{l}
(f\ X \to X)\ (f\ (f\ (f\ a))) \ \mapsto\!\!\!\to_{\rho\sigma}\ f\ (f\ a) \\[4pt]
(f\ Y \to Y)\ ((f\ X \to X)\ (f\ (f\ (f\ a)))) \ \mapsto\!\!\!\to_{\rho\sigma}\ (f\ Y \to Y)\ (f\ (f\ a)) \ \mapsto\!\!\!\to_{\rho\sigma}\ f\ a
\end{array}
$$

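In general, encoding (first and higher order) rewriting systems in the untyped $\rho$-calculus
requires a complex translation mechanism [8,4].

$$
\begin{array}{ll}
 & plus.rec\ (add\ \overline{n}\ \overline{m}) \\
\mapsto\!\!\!\to_{\rho\sigma\delta} & (add\ 0\ Y \to Y)\ (add\ \overline{n}\ \overline{m}), \\
 & (add\ (suc\ X)\ Y \to suc\ (plus.rec\ (add\ X\ Y)))\ (add\ \overline{n}\ \overline{m}) \\
\mapsto\!\!\!\to_{\rho\sigma\delta} & [add\ 0\ Y \ll add\ \overline{n}\ \overline{m}]\,Y, \\
 & suc\ (plus.rec\ (add\ \overline{n-1}\ \overline{m})) \\
\mapsto\!\!\!\to_{\mathsf{stk}} & suc\ (plus.rec\ (add\ \overline{n-1}\ \overline{m})) \\
\mapsto\!\!\!\to_{\rho\sigma\delta} & suc\ ((add\ 0\ Y \to Y)\ (add\ \overline{n-1}\ \overline{m}), \\
 & \quad (add\ (suc\ X)\ Y \to suc\ (plus.rec\ (add\ X\ Y)))\ (add\ \overline{n-1}\ \overline{m})) \\
\mapsto\!\!\!\to_{\rho\sigma\delta} & suc\ ([add\ 0\ Y \ll add\ \overline{n-1}\ \overline{m}]\,Y, \\
 & \quad suc\ (\cdots suc\ ((add\ 0\ Y \to Y)\ (add\ 0\ \overline{m}), \\
 & \quad (add\ (suc\ X)\ Y \to suc\ (plus.rec\ (add\ X\ Y)))\ (add\ 0\ \overline{m})))) \\
\mapsto\!\!\!\to_{\mathsf{stk}} & suc\ (suc\ (\cdots suc\ ((add\ 0\ Y \to Y)\ (add\ 0\ \overline{m}), \\
 & \quad (add\ (suc\ X)\ Y \to suc\ (plus.rec\ (add\ X\ Y)))\ (add\ 0\ \overline{m})))) \\
\mapsto\!\!\!\to_{\rho\sigma\delta} & suc\ (suc\ (\cdots suc\ ([add\ 0\ Y \ll add\ 0\ \overline{m}]\,Y, \\
 & \quad [add\ (suc\ X)\ Y \ll add\ 0\ \overline{m}](suc\ (plus.rec\ (add\ X\ Y)))))) \\
\mapsto\!\!\!\to_{\mathsf{stk}} & suc\ (suc\ (\cdots suc\ ([add\ 0\ Y \ll add\ 0\ \overline{m}]\,Y))) \\
\mapsto\!\!\!\to_{\rho\sigma\delta} & suc\ (suc\ (\cdots (suc\ \overline{m}))) \\
= & \overline{m+n}
\end{array}
$$

Fig. 4. A complete reduction for a $\rho$-term encoding addition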



An (ad-hoc) Object-Oriented Encoding. We can define in the typed $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ a suitable
self-duplicating term that allows us to simulate the global behavior of a TRS $\mathcal{R}$. Let us begin
with the example of addition, using two constants $rec^{(lab \to \iota \to \iota) \to lab}$ and $add^{\iota \to \iota \to \iota}$.

Example 2 (Addition).

$$
plus \triangleq rec\ S \to \left( \begin{array}{l} add\ 0\ Y \to Y, \\ add\ (suc\ X)\ Y \to suc\ (S.rec\ (add\ X\ Y)) \end{array} \right)
$$

Intuitively, the variable S acts like the meta-variable this in Java and thus the recursive
application of the different rules is realized explicitly by using this variable in the
right-hand side of the corresponding rules.

This term indeed computes the addition over Peano integers, as illustrated in
Fig. 4. The expressions “$\overline{m}$”, “$\overline{m+n}$” and “$\overline{m-n}$” are just aliases for the Peano
representations of these numbers as sequences of $suc(\dots(suc\ 0)\dots)$. It is worth
noticing that all the stuck results are dropped by $\mapsto_{\mathsf{stk}}$; the only interesting result is
$[add\ 0\ Y \ll add\ 0\ \overline{m}]\,Y$. During the reduction, on the left of this term (or terms reducing
to it) all the terms get stuck because we try to match 0 against $suc\ \overline{n}$; on the right
too, because we try to match $suc\ X$ against 0. Notice that if we erase from the term plus
all the “administrative” subterms which encode the recursive machinery, we get back a
TRS computing addition:

$$
\begin{array}{rcl}
add(0, Y) & \to & Y \\
add(suc(X), Y) & \to & suc(add(X, Y))
\end{array}
$$

Observe that, in this rather ad-hoc encoding of plus, we have put S.rec only before the
symbol add in the right-hand side of the second rule because we know that it is the only
position where further rewriting has to be done. In the next paragraph we describe a
more general method for encoding a TRS.
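
As a point of comparison, the two-rule TRS for addition can be run directly, outside the $\rho$-calculus. The sketch below is only an illustration of the rewrite rules themselves: the tuple representation and the helper names `peano`, `step` and `normalize` are assumptions of the example, and the normalizer uses an innermost strategy rather than the reduction shown in Fig. 4.

```python
# Assumed representation (hypothetical): "0" is zero, ("suc", t) is a successor,
# ("add", t1, t2) is an addition.

def peano(n):
    """Build the Peano numeral suc(...(suc 0)...) with n successors."""
    term = "0"
    for _ in range(n):
        term = ("suc", term)
    return term

def step(term):
    """Apply one of the two rules at the root, or return None if neither applies."""
    if isinstance(term, tuple) and term[0] == "add":
        _, x, y = term
        if x == "0":
            return y                                  # add(0, Y) -> Y
        if isinstance(x, tuple) and x[0] == "suc":
            return ("suc", ("add", x[1], y))          # add(suc(X), Y) -> suc(add(X, Y))
    return None

def normalize(term):
    """Innermost normalization: normalize the arguments, then rewrite at the root."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(sub) for sub in term[1:])
    reduct = step(term)
    return term if reduct is None else normalize(reduct)


print(normalize(("add", peano(2), peano(3))) == peano(5))  # True
```

Since the TRS is convergent, every strategy reaches the same normal form, which is why the innermost strategy used here agrees with the result computed by the encoded term.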



An Automated Encoding Using the first Operator. In this paragraph, we show that
any convergent and well-typed TRS $\mathcal{R}$ can be mechanically encoded (and typechecked)
in $\rho^{\mathsf{fix}}_{\mathsf{stk}}$. Recall that a TRS $\mathcal{R}$ is convergent if it is confluent and (strongly) terminating; $\mathcal{R}$
is well-typed if all the rewrite rules can be typechecked with the same type for both sides
of the arrow. The encoding is done by wrapping $\mathcal{R}$ into a typed object-based fixpoint
engine, which is a $\rho$-term that encodes all the rules in $\mathcal{R}$ and applies the translated rules
recursively until none of them is applicable. Definition 7 details how this $\rho$-term is built:
the right-hand sides are modified so that the whole system can be re-applied to any of
the symbols appearing in the term.

We first define the operator “first” that tries to apply successively n rules
$A_1, A_2, \dots, A_n$ to a term B and returns the result of the first rule whose application
succeeds (i.e. does not reduce to stk). Here we use the constant stk to detect the failure
of a given rule and the identity $I \triangleq Y \to Y$ to yield a successful result:

$$
first(A_1, A_2, \dots, A_n) \triangleq X \to ((\mathsf{stk} \to A_n\ X,\ I)\ (\dots\ (\mathsf{stk} \to A_2\ X,\ I)\ (A_1\ X)))
$$

One can check that when we reduce $first(A, B)\ C$, if $A\ C$ reduces to stk then the final
result is the reduct of the term $B\ C$, since the stk produced by $A\ C$ will be discarded
by further $\mapsto_{\mathsf{stk}}$ reductions even if it is accepted by the identity. If $A\ C$ reduces to an
algebraic term D different from stk, it passes through I and, since the matching equation
$\mathsf{stk} \ll D$ definitively fails (leading to a stk term), the final result is D. The same behavior
is obtained for an arbitrary number of arguments for first. In particular, we will use
the term $first(A_1, \dots, A_n, I)$, which tries successively the n rules $A_1, \dots, A_n$ on the
argument B, and returns B unchanged if every $A_i\ B$ fails.

From a typed point of view, the behavior of first is easy to understand: every $A_i$ can
be applied to X, so each must have a type $\tau \to \tau_i$ where $X : \tau$. Moreover, for each
i, we apply $(\mathsf{stk} \to A_{i+1}\ X,\ I)$ to $A_i\ X$. Here the identity I has type $\tau_i \to \tau_i$. Since all
the members of a structure must have the same type, $A_{i+1}\ X$ has type $\tau_i$ too. By trivial
induction, all the $A_i$ have the same type, which is the type of $first(A_1, \dots, A_n)$. We
can informally state this result by

$$
\Gamma \vdash first : (\tau \to \tau_0) \to \dots \to (\tau \to \tau_0) \to \tau \to \tau_0
$$

where $\Gamma$ is such that $\Gamma \vdash A_i : \tau \to \tau_0$ for each $A_i$.
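
The observable behavior of first can be mimicked in an ordinary programming language. The sketch below is an operational analogue, not the $\rho$-term itself: rules are modelled as functions that return `None` on failure (so `None` plays the role of stk), and the names `first`, `halve`, `dec` and `identity` are hypothetical.

```python
def first(*rules):
    """Return a function that applies the first rule that does not fail.

    A rule is any callable returning None on failure; None stands in for stk.
    """
    def apply(term):
        for rule in rules:
            result = rule(term)
            if result is not None:
                return result
        return None  # every rule failed: the analogue of a stk result
    return apply


# Hypothetical rules on integers, for illustration only.
halve = lambda n: n // 2 if n % 2 == 0 else None   # defined on even numbers only
dec = lambda n: n - 1 if n > 0 else None           # defined on positive numbers only
identity = lambda n: n                             # always succeeds, like I

step = first(halve, dec, identity)
print(step(8), step(7))  # 4 6
```

Placing `identity` last corresponds to the term $first(A_1, \dots, A_n, I)$: the argument is returned unchanged when every other rule fails. As in the typed reading above, all rules are expected to share one argument type and one result type.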

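In what follows, we denote by s, t, ... algebraic terms (in the sense of the grammar
$x \mid f(t, \dots, t)$) and a term rewriting system by $\mathcal{R} = \{t_i \to s_i\}^{i=1 \dots n}$. We write $t \mapsto\!\!\!\to_{\mathcal{R}} t'$
when t can be rewritten to the term $t'$ in normal form w.r.t. the TRS $\mathcal{R}$.

Definition 7 (Object-based fixpoint engine). Let $\mathcal{R} = \{t_i \to s_i\}^{i=1 \dots n}$ be an untyped
TRS with terms built on a signature $\Sigma = \{a_1, \dots, a_m\}$; let S be a fresh variable w.r.t.
$\mathcal{R}$. The encoding of $\mathcal{R}$ in $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ is done as follows:

- the terms $t_i$ and $s_i$ are transformed into $\rho$-terms using the translation $(\!| - |\!)$:

$$
\begin{array}{rcl}
(\!| x |\!) & \triangleq & X \\
(\!| f(t_1, \dots, t_n) |\!) & \triangleq & f\ (\!| t_1 |\!) \cdots (\!| t_n |\!)
\end{array}
$$

- the $\rho^{\mathsf{fix}}_{\mathsf{stk}}$-term encoding $\mathcal{R}$ is denoted $(\!| \mathcal{R} |\!)$:

$$
(\!| \mathcal{R} |\!) \triangleq \left( \begin{array}{l}
rec\ S \to first \left( \begin{array}{l}
(\!| t_1 |\!) \to S.rec\ (\!| s_1 |\!), \\
\dots, \\
(\!| t_n |\!) \to S.rec\ (\!| s_n |\!), \\
a_1\ \overline{X} \to S.Rec\ (a_1\ (S.rec\ \overline{X})), \\
\dots, \\
a_m\ \overline{X} \to S.Rec\ (a_m\ (S.rec\ \overline{X}))
\end{array} \right), \\[6pt]
Rec\ S \to first \left( \begin{array}{l}
(\!| t_1 |\!) \to S.rec\ (\!| s_1 |\!), \\
\dots, \\
(\!| t_n |\!) \to S.rec\ (\!| s_n |\!), \\
I
\end{array} \right)
\end{array} \right)
$$

The result corresponding to the rewriting of an input term t w.r.t. a TRS $\mathcal{R}$ is computed
by the $\rho^{\mathsf{fix}}_{\mathsf{stk}}$-term $(\!| \mathcal{R} |\!).rec\ (\!| t |\!)$.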

This encoding enforces an outermost strategy: the $\rho$-term $(\!| \mathcal{R} |\!).rec$ first tries to apply a
rule at top level, and if none succeeds it uses the rules $(a_i\ \overline{X} \to \dots\ S.rec\ \overline{X})$, $1 \le i \le m$, to
propagate the TRS deeper in the term. The second method, called by S.Rec, no longer
needs to propagate the TRS inside the term because it is used only when the subterms
have been totally reduced, thus the only possibly reducible position is the head of the
term. Some more subtle combinations of the different rules in the TRS could lead to
various interesting strategies like, for example, innermost or call-by-need.

We prove that the encoding is faithful for TRS satisfying confluence and termination,
which is often required in rewriting-based languages for (the part of the) rewriting
systems that are not guided by a strategy.

Theorem 5 (Soundness and Completeness of the Rewriting Engine).

1. For any TRS $\mathcal{R}$, and any algebraic input term t, if A is a $\rho^{\mathsf{fix}}_{\mathsf{stk}}$-term in normal form
w.r.t. $\mapsto^{\mathsf{stk}}_{\rho\sigma\delta}$ and without matching failures, then:

$$
(\!| \mathcal{R} |\!).rec\ (\!| t |\!) \mapsto\!\!\!\to^{\mathsf{stk}}_{\rho\sigma\delta} A \implies t \mapsto\!\!\!\to_{\mathcal{R}} t'
$$

where $(\!| t' |\!) = A$.

2. If the TRS $\mathcal{R}$ is well-typed and convergent, then for any algebraic terms t, t',

$$
t \mapsto\!\!\!\to_{\mathcal{R}} t' \implies (\!| \mathcal{R} |\!).rec\ (\!| t |\!) \mapsto\!\!\!\to^{\mathsf{stk}}_{\rho\sigma\delta} (\!| t' |\!)
$$

Remark 1. These conditions are tight, in the sense that, for most of the non-confluent (or
non-terminating) TRS, there is a term t and a reduction path $t \mapsto\!\!\!\to_{\mathcal{R}} t'$ which cannot be
mimicked by $(\!| \mathcal{R} |\!).rec\ (\!| t |\!)$. For non-confluence this can be easily seen: our encoding
enforces a particular strategy, so if two reduction paths are possible from a given term,
then the engine has to choose one.

Thus, this encoding allows one to use the rewriting calculus in order to represent many
well-typed TRS, and at the same time to have control over the correctness of the rules by
means of a simple typechecking mechanism.

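Example 3 (Computing the length of a list).

$$
length \triangleq \left( \begin{array}{l}
rec\ S \to first \left( \begin{array}{l}
len\ nil \to 0, \\
len\ (cons\ X\ L) \to S.rec\ (suc\ (len\ L)), \\
suc\ X \to S.Rec\ (suc\ (S.rec\ X))
\end{array} \right), \\[6pt]
Rec\ S \to first \left( \begin{array}{l}
len\ nil \to 0, \\
len\ (cons\ X\ L) \to S.rec\ (suc\ (len\ L)), \\
I
\end{array} \right)
\end{array} \right)
$$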



Type-checking all the Encodings. Each of the above encodings can be completed by
a type-checking phase. The terms built in $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ can fail to be well-typed only if the initial
rewriting system cannot be typed correctly.

(plus) It is easy to check that the naive encoding of plus can be type-checked with
$\Gamma \vdash plus : lab \to \iota \to \iota$ where $\Gamma = rec{:}(lab \to \iota \to \iota) \to lab,\ add{:}\iota \to \iota \to \iota,\ suc{:}\iota \to \iota,\ 0{:}\iota$.

($(\!| \mathcal{R} |\!)$) The object-based fixpoint engine has type $lab \to \tau \to \tau$ where $\tau$ is the type of
the data manipulated by $\mathcal{R}$. Here rec and Rec both have type $(lab \to \tau \to \tau) \to lab$.
There must be a unique type $\tau$ for the data manipulated by the TRS: as we said
before about first, all the $A_i$ must have a common type $\tau \to \tau_0$. Since the identity I
(applied if no rule in $\mathcal{R}$ can be used) has a type $\tau \to \tau$, all the rules in $\mathcal{R}$ must have
the same type for their left and right-hand sides. This condition is not required in
term rewriting systems, but it is generally imposed, for safety’s sake, in most of the
languages based on rewriting (e.g. Elan [12], Maude [16]).

(length) Similarly to plus, the term length type-checks with $\Gamma \vdash length : lab \to \iota \to \iota$,
where $\Gamma = rec{:}(lab \to \iota \to \iota) \to lab,\ cons{:}\iota \to list \to list,\ len{:}list \to \iota,\ nil{:}list$.
Notice that the type of the constant rec depends on the type of the data
the TRS manipulates: we need in fact a whole class of rec constants (roughly one
for each type) in order to write any fixpoint.



5 Conclusions: Related and Future Work

We have studied the expressive power of the simply typed $\rho^{\mathsf{fix}}_{\mathsf{stk}}$. The defined type system
was mainly adapted from the simply typed $\lambda$-calculus. The type system deals with all
the particularities of the calculus (abstraction over arbitrary patterns, delayed matching
constraints, structures of rules and results). The most interesting properties of typed
calculi are valid: subject reduction, type uniqueness, and decidability of typing.

Early versions of the (untyped) rewriting calculus have already been used to describe
the implicit (leftmost-innermost) and user-defined strategies of Elan [8], but some ad hoc
operators were added to the basic calculus for this. The $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ presented here is a simpler
formulation of the $\rho$-calculus essentially based on [11], where no new constructions have
to be defined for expressing strategies for quite a large class of rewriting systems. The
price to pay for this simplicity is that the encoded rewriting systems we handle are not
the most general ones but still the most used ones in practice. The ability to typecheck
term rewriting systems and strategies ensures a good trade-off between expressiveness
and safety of programs.

Related Work. Some aspects of the relations between rewriting, $\lambda$-calculus and types
have already been explored. V. van Oostrom has widely studied the confluence of a
$\lambda$-calculus with patterns [17], but the presented language is untyped. D. Kesner, L. Puel
and V. Tannen have proposed a typed pattern calculus [14] which has been designed
as a computational interpretation of the Gentzen sequent proofs for the intuitionistic
propositional logic. Our encoding of TRS shares some similarities with the one presented
by S. Byun et al. [6] that describes an untyped encoding of every strongly separable
orthogonal TRS into $\lambda$-calculus.

The type system of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ presented in this paper is quite different from the ones
recently presented in the literature [10,3] since $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ does not use dependent types and thus
patterns do not occur in types. In particular, the results about uniqueness, decidability and
non-normalization cannot be transposed straightforwardly to those (logic-oriented) type
systems. Again, the main objective of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ is to set the typed theoretical framework for a
programming language featuring sophisticated and user-customizable pattern matching
facilities.

Some similar ways of producing non-normalization appear in various formalisms.
N. P. Mendler [15] has shown that, when introducing recursive definitions in the typed
$\lambda$-calculus, strong normalization is no longer enforced by typing if the type constructors
do not satisfy a “positiveness condition”. This kind of condition is still present in the
Calculus of Inductive Constructions, which is the basis of the Coq proof assistant. The
issue appears in programming languages too: for instance, in ML, one can define any
recursive function without using the keyword let rec.

Therefore, the type system of $\rho^{\mathsf{fix}}_{\mathsf{stk}}$ is suitable for static analysis, i.e. it ensures that
functions get arguments of the expected type. However, as an intended feature, it does not
enforce termination of the typed terms. We have shown the encoding of some interesting
terms leading to infinite reductions by the use of the pattern matching features of the
calculus. The consequence is that our typing discipline fits a programming language,
since we are interested in type consistency and in recursive (potentially non-terminating)
programs. Conversely, it is not suited to defining a Logical Framework, since
normalization is strongly linked to consistency, and it definitively differs from previous
proposals of the authors [10,3].

Acknowledgment. The authors are sincerely grateful to Claude Kirchner for many
fruitful discussions and invaluable comments about this work.

Abstract. In this paper we present a contraction-free sequent calculus
including inductive definitions for the first-order intuitionistic logic. We
show that it is a natural extension to Dyckhoff’s LJT calculus and we
prove the contraction- and cut-elimination properties, thus extending
Dyckhoff’s result, in order to validate its use as a basis for proof-search
procedures. Finally we describe the proof-search strategy used in our
implementation as a tactic in the Coq proof assistant [2].



1 Introduction 

Most basic intuitionistic predicate calculi using sequents [9,4] include the structural
rule of contraction or a left-introduction rule for the arrow in which the
principal formula stays in the left premise:

$$
\frac{\Gamma, A, A \vdash G}{\Gamma, A \vdash G}\ \mathrm{Contr}
\qquad
\frac{\Gamma, A \to B \vdash A \qquad \Gamma, B \vdash G}{\Gamma, A \to B \vdash G}\ \mathrm{L}{\to}
$$

Those rules have obvious bad properties if we use them in a bottom-up proof-search
procedure, since they can lead to loops in the proof-search process if not
restricted.

In [5], Roy Dyckhoff described LJT, a calculus for the intuitionistic propositional
logic without contraction. Instead he put forward that contraction could
be shown admissible, i.e. it could be seen as an implicit rule in his system.
Furthermore, he split the $\mathrm{L}{\to}$ rule into several subcases depending on the formula
on the left of the arrow, and that way avoided the repetition of the principal
formula in the premise.

In [6], together with Sara Negri, he gave a direct proof of cut-elimination for
this system, and for its extension to the first-order quantifiers $\forall$ and $\exists$. Of course
this extension did not have any termination property similar to that of LJT
because of the rules about the universal quantifier.

The propositional part of the LJT sequent calculus has been implemented
in the Coq proof assistant as a proof-search procedure: the tauto tactic [12] (see footnote 1).
This procedure performs a depth-first search of proofs with optimization of search
using reversibility of rules in the calculus. This tactic is also used as a goal
simplification procedure called intuition. The approach used was successful so
we wanted to extend it to first-order reasoning.

1 In Coq v8.0, tactic names start with lower case letters.

Moreover, two attempts at automating the predicate calculus in Coq were
previously made. The first one was the implementation of a decision procedure for
the direct predicate calculus [8,1], a decidable restriction of the predicate calculus
to its linear fragment. It led to the linear tactic [7], which was implemented in
early versions of Coq. It has since been discontinued, because this fragment is
not powerful enough.

The second attempt has been the port of the jprover module [14] from the
Nuprl prover [10] to Coq. It is basically made of a classical tableau prover packed
with a constraint solver to restrict it to intuitionistic logic. Similarly to linear,
this tactic behaves as a black box constructing a complete proof in one step.
But it doesn’t handle the case of $\forall x.P[x] \vdash \exists y.P[y]$ where the domain must be
inhabited, and it has a very restricted view of logical connectives. Moreover its
black-box behavior forbids its use as a goal simplification procedure.

Our purpose was to adapt Dyckhoff’s system so that it could be used in
a natural way for first-order intuitionistic proof-search in Coq. In order to do
that we had to cope with the fact that in Coq only the implication $\to$ and the
universal quantification $\forall$ are primitive constructions (they are two forms of
dependent products), whereas the standard logical connectives $\wedge$, $\vee$, $\top$ and even the
existential quantifier $\exists$ can be defined in terms of inductive definitions.

So we propose here a variant of Dyckhoff’s LJT calculus where the primitive
logical connectives are $\forall$, $\to$ and inductive definitions, viewing other connectives
as particular cases of inductive definitions, but also allowing many more possible
constructions.

In section 2, we first present our inductive definitions and the correspond- 
ing notion of first-order formula, and we show how this notion gives a natural 
extension of Dyckhoff’s calculus. Then in section 3 we prove that our calculus 
enjoys both contraction- and cut-elimination properties. Finally in section 4 we 
discuss some proof-search strategy issues and present our implementation of a 
proof-search procedure based on this calculus. 



2 A Sequent Calculus with Inductive Formulae 

2.1 Introducing Inductive Formulae 

In the following text we will use the notations $\vec{H_i}$, $\vec{H_i} \to X$, $\vec{x}$ and $\forall \vec{y_i}.X$ as
shortcuts for $H_{i,1}, \dots, H_{i,p}$, $H_{i,1} \to (\dots \to (H_{i,p} \to X))$, $x_1, \dots, x_p$ and $\forall y_{i,1}. \dots \forall y_{i,p}.X$.
Please note that the length of the sequences is always fixed a priori, and that
the meaning of $\vec{H_i}$ depends on whether or not it is followed by an arrow. We suppose
implication is right-associative and has higher priority than $\forall$. We will also
use the $\{\_\}_i$ notation to mean either a sequence of formulae or a (finite) set of
premises in a rule, where i ranges over the constructor indices or the hypotheses
indices of an inductive formula.

To define our class of formulae we start with a signature of first-order constants
and predicates of fixed arity. Any term formed by the application of an n-ary
predicate to n well-formed terms possibly containing variables will be called an
atomic formula, and the variables occurring in the n terms will be called the
free variables of this formula.

Then we define compound formulae and inductive families mutually recursively,
so let us begin with the inductive families. An inductive family is a triple
$(I, \vec{X}, \{C_1 : \tau_1; C_2 : \tau_2; \ldots\})$ where $I$ is the name of the inductive family and $\vec{X}$ is a
possibly empty list of formal parameters, each having a fixed arity and being either
a propositional or a first-order parameter. $C_i$ is the name and $\tau_i$ the type of the $i$th
constructor, which is itself a formula.

Then we define our formula language inductively as follows: A formula is
either an atomic formula or a compound formula. If $A$ and $B$ are formulae then so
is $A \to B$, if $P[x]$ is a formula then so is $\forall x.P[x]$, and if $\vec{t}$ is a sequence of
parameters whose arity and class (formula or term) fit those of the formal parameters
of the $I$ family, then $I(\vec{t})$ is a formula. Implication and universal quantification
behave as usual regarding free and bound variables. The free variables in $I(\vec{t})$
are those in $\vec{t}$.

A constructor type must be a formula made of a (possibly empty) sequence
of universal quantifications and implications, and the head of that formula must
be $I(\vec{X})$. The formal parameters must not be bound by the quantifiers, but all
other free variables must be universally quantified.

Without loss of generality we will assume that constructor types are in weak
prenex form, i.e. all dependent products outermost, thus being of the form
$\forall\vec{y_i}.\vec{H_i}(\vec{X}, \vec{y_i}) \to I(\vec{X})$. We will call $\vec{H_i}$ the logical hypotheses of the
constructor and $\vec{y_i}$ the first-order variables of the constructor.

We suppose that inductive families we consider are neither recursive nor 
mutually recursive, i.e. the relation defined by the use of an inductive family in 
the logical hypotheses of another one is well-founded. 

Here we give a set of examples of inductive definitions defining standard
connectives:

$$(\wedge, (A, B), \{\mathit{pair} : A \to B \to A \wedge B\})$$

$$(\vee, (A, B), \{\mathit{inj}_l : A \to A \vee B;\ \mathit{inj}_r : B \to A \vee B\})$$

$$(\bot, (), \{\})$$

$$(\top, (), \{\mathit{triv} : \top\})$$

$$(\exists, (H[\_]), \{\mathit{witness} : \forall y.H[y] \to \exists x.H[x]\})$$

The $\wedge$ and $\vee$ inductive families have two propositional parameters of arity
0, $\top$ and $\bot$ have none, and $\exists$ has one propositional parameter of arity 1.
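The families above, together with the requirement that families be neither recursive nor mutually recursive, can be illustrated by a minimal Python sketch. The class names, the `uses` field and the `neg_or` family are assumptions made for this illustration only; hypotheses are kept as plain strings rather than parsed formulae.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constructor:
    name: str
    variables: tuple = ()            # first-order variables bound by the constructor
    hypotheses: tuple = ()           # logical hypotheses, kept as plain strings here
    uses: frozenset = frozenset()    # families occurring in the hypotheses (assumed field)


@dataclass(frozen=True)
class Family:
    name: str
    params: tuple
    constructors: tuple


AND = Family("and", ("A", "B"), (Constructor("pair", (), ("A", "B")),))
OR = Family("or", ("A", "B"), (Constructor("inj_l", (), ("A",)),
                               Constructor("inj_r", (), ("B",))))
FALSE = Family("false", (), ())
TRUE = Family("true", (), (Constructor("triv"),))
EX = Family("ex", ("H",), (Constructor("witness", ("y",), ("H[y]",)),))
# Hypothetical family whose hypothesis mentions another family.
NEG_OR = Family("neg_or", ("A", "B"),
                (Constructor("c", (), ("or(A, B) -> false",), frozenset({"or", "false"})),))


def is_well_founded(families):
    """Return True if no family depends on itself through constructor hypotheses."""
    deps = {f.name: set().union(*(c.uses for c in f.constructors)) for f in families}
    state = {}

    def visit(name):
        if state.get(name) == "done":
            return True
        if state.get(name) == "open":      # back edge: a (mutual) recursion
            return False
        state[name] = "open"
        ok = all(visit(dep) for dep in deps.get(name, ()))
        state[name] = "done"
        return ok

    return all(visit(name) for name in deps)


if __name__ == "__main__":
    print(is_well_founded([AND, OR, FALSE, TRUE, EX, NEG_OR]))
```

A family that mentions itself in one of its hypotheses is rejected by the same check, which is the situation the calculus excludes.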

Given those definitions, the meaning of inductive formulae is that

$$I(\vec{t}) \Leftrightarrow \bigvee_i \Bigl(\exists\vec{y_i}.\bigwedge_j H_{i,j}(\vec{t}, \vec{y_i})\Bigr)$$


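But since we use inductive types instead of just defining the connectives as
plain identifiers, the elimination principles of the Calculus of Inductive
Constructions give us primitive left- and right-introduction rules built into the calculus
[13], instead of unfolding the definition and dealing with standard connectives.

$$\frac{}{\Gamma, P \vdash P}\;Ax$$

$$\frac{\Gamma, A \vdash B}{\Gamma \vdash A \to B}\;R{\to} \qquad \frac{\Gamma, P, B \vdash G}{\Gamma, P, P \to B \vdash G}\;La{\to}$$

$$\frac{\Gamma, A, B \to C \vdash B \qquad \Gamma, C \vdash G}{\Gamma, (A \to B) \to C \vdash G}\;L{\to}{\to}$$

$$\frac{\Gamma \vdash A[x]}{\Gamma \vdash \forall x.A[x]}\;R\forall \qquad \frac{\Gamma, \forall x.A[x], A[t] \vdash G}{\Gamma, \forall x.A[x] \vdash G}\;L\forall$$

$$\frac{\Gamma, (\forall x.A[x]) \to B \vdash \forall x.A[x] \qquad \Gamma, B \vdash G}{\Gamma, (\forall x.A[x]) \to B \vdash G}\;L\forall{\to}$$

$$\frac{\{\Gamma \vdash H_{i,j}(\vec{t}, \vec{u})\}_j}{\Gamma \vdash I(\vec{t})}\;RI_i \qquad \frac{\{\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash G\}_i}{\Gamma, I(\vec{t}) \vdash G}\;LI$$

$$\frac{\Gamma, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i \vdash G}{\Gamma, I(\vec{t}) \to B \vdash G}\;LI{\to}$$

Fig. 1. The LJTI calculus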

Let us see some more exotic examples: many specific predicates may be
defined by non-recursive inductive definitions. For example we express that $A$
satisfies the excluded-middle property using:

$$(\mathrm{Dec}, (A), \{\mathit{istrue} : A \to \mathrm{Dec}(A);\ \mathit{isfalse} : (A \to \bot) \to \mathrm{Dec}(A)\})$$

Another example could be to express the Euclidean division of two natural numbers.
That is, $\mathrm{Eucl\_div}(a, b)$ gives both witnesses $q, r$ and proofs of $r < b$ and
$a = bq + r$. Eucl_div has two first-order parameters of arity 0:

$$(\mathrm{Eucl\_div}, (a, b), \{\mathit{EDintro} : \forall q.\forall r.(r < b) \to (a = bq + r) \to \mathrm{Eucl\_div}(a, b)\})$$



2.2 The LJTI Sequent Calculus 

From now on we will assume that $t$ ranges over first-order terms, $x, y$ over
first-order variables, $A \ldots G$ over arbitrary formulae, $P, Q$ over atomic formulae,
and $\Gamma, \Gamma', \Gamma''$ over multisets of formulae. When we
write $P[x]$ we assume that $x$ is not free in $P[y]$ if $x \neq y$, and that any variable
free in $t$ is free in $P[t]$ (we allow the use of $\alpha$-renaming in $P$).

Using the definition of inductive formula above we define the LJTI sequent
calculus in figure 1. Please note that using generic inductive definitions we have
a smaller number of rules in our system than in LJT. Note that in the $Ax$ and
$La{\to}$ rules $P$ must be an atomic formula.

In the right introduction rule $RI_i$, $i$ ranges over the constructor indices, so
there is one such rule for each constructor, and in the left introduction rule $LI$
the existential variables $\vec{y_i}$ must follow the eigenvariable condition, and so must $x$
in the $R\forall$ rule. This means $x$ and $\vec{y_i}$ must not occur free in the conclusion of
those rules (or equivalently they must not occur free in $\Gamma$, $G$ or $\vec{t}$).

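For instance, if we try to apply this scheme to $\bot$, we get the following rules:

$$(\text{no } R\bot \text{ rule}) \qquad \frac{}{\Gamma, \bot \vdash G}\;L\bot \qquad \frac{\Gamma \vdash G}{\Gamma, \bot \to B \vdash G}\;L\bot{\to}$$

You can check that the rules for $\bot$ match those for the standard connectives
in [6], except for $L\bot{\to}$, which is a special case of weakening that is invertible (see
rule (2) and rule (8)). For Eucl_div we would get:

$$\frac{\Gamma \vdash r < b \qquad \Gamma \vdash a = bq + r}{\Gamma \vdash \mathrm{Eucl\_div}(a, b)}\;R\mathrm{Eucl\_div}$$

$$\frac{\Gamma, r < b, a = bq + r \vdash G}{\Gamma, \mathrm{Eucl\_div}(a, b) \vdash G}\;L\mathrm{Eucl\_div}$$

$$\frac{\Gamma, \forall q.\forall r.(r < b) \to (a = bq + r) \to A \vdash G}{\Gamma, \mathrm{Eucl\_div}(a, b) \to A \vdash G}\;L\mathrm{Eucl\_div}{\to}$$

In the $L\mathrm{Eucl\_div}$ rule $q$ and $r$ must not be free in $\Gamma$ or $G$, nor in $a$ or $b$.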
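The way the three rule schemes are instantiated for a given family can be mimicked mechanically. The following Python sketch prints the premises and conclusion of the right, left and left-arrow rules as strings; the tuple encoding of families and the function names are assumptions of this illustration, and side conditions on eigenvariables are not checked.

```python
# A family is (name, arguments, constructors); a constructor is
# (name, first_order_variables, hypotheses). Hypotheses are plain strings.
EUCL_DIV = ("Eucl_div", ("a", "b"),
            [("EDintro", ("q", "r"), ("r < b", "a = b*q + r"))])
BOTTOM = ("⊥", (), [])


def head(family):
    name, args, _ = family
    return f"{name}({', '.join(args)})" if args else name


def right_rule(family, i):
    """Premises and conclusion of the right rule for the i-th constructor."""
    _, _, hyps = family[2][i]
    return [f"Γ ⊢ {h}" for h in hyps], f"Γ ⊢ {head(family)}"


def left_rule(family):
    """One premise per constructor; no premise at all for an empty family."""
    premises = [", ".join(["Γ", *hyps]) + " ⊢ G" for _, _, hyps in family[2]]
    return premises, f"Γ, {head(family)} ⊢ G"


def curried(variables, hyps, result="B"):
    """Build ∀y1...∀yn.(H1)→...→(Hp)→B for one constructor."""
    body = "→".join([f"({h})" for h in hyps] + [result])
    return "".join(f"∀{v}." for v in variables) + body


def left_arrow_rule(family):
    """Single premise replacing I(t)→B by one curried formula per constructor."""
    parts = ["Γ"] + [curried(vs, hyps) for _, vs, hyps in family[2]]
    return [", ".join(parts) + " ⊢ G"], f"Γ, {head(family)}→B ⊢ G"


if __name__ == "__main__":
    for rule in (right_rule(EUCL_DIV, 0), left_rule(EUCL_DIV),
                 left_arrow_rule(EUCL_DIV), left_rule(BOTTOM), left_arrow_rule(BOTTOM)):
        print(rule)
```

For the empty family the left rule has no premise and the left-arrow rule has the single premise `Γ ⊢ G`, matching the two rules for absurdity displayed above.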



3 Properties of the LJTI Calculus 

3.1 Inversion Lemmata 



We first give a series of lemmata about invertibility of rules, and admissibility 
of weakening. 

We say that a rule is admissible in LJTI if, for every instance of the rule whose
premise(s) are derivable in LJTI, the conclusion is also derivable in LJTI. When
there is only one premise in the rule, we say that this rule is strongly admissible
if the conclusion has a derivation whose height is at most that of the derivation
of the premise, the height being 0 for an axiom and the maximum of the
heights of the derivations of the premises plus one otherwise.
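The notion of height used here is easy to state as code. In this small Python sketch a derivation is encoded as a pair of a rule name and a list of sub-derivations; this encoding is an assumption of the example, and any rule instance without premises is treated like an axiom.

```python
def height(derivation):
    """Height of a derivation tree: 0 for a leaf, else 1 + max over the premises."""
    _rule, premises = derivation
    if not premises:
        return 0
    return 1 + max(height(p) for p in premises)


def witnesses_strong_admissibility(premise_derivation, conclusion_derivation):
    """Check the height condition for one instance of a one-premise rule."""
    return height(conclusion_derivation) <= height(premise_derivation)


if __name__ == "__main__":
    axiom = ("Ax", [])
    proof = ("R->", [("L->->", [axiom, ("La->", [axiom])])])
    print(height(axiom), height(proof))
```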



Lemma 1. The following rules are strongly admissible in LJTI.

$$\frac{\Gamma[x] \vdash G[x]}{\Gamma[t] \vdash G[t]}\;(1) \qquad \frac{\Gamma \vdash G}{\Gamma, \Gamma' \vdash G}\;W\;(2)$$


Proof.

rule (1): By structural induction on the derivation tree, renaming eigenvariables
by induction hypothesis.

rule (2): By structural induction on the derivation tree, using rule (1) to
rename eigenvariables where needed.

Lemma 2. The following rules are strongly admissible in LJTI:

$$\frac{\Gamma \vdash A \to B}{\Gamma, A \vdash B}\;(3) \qquad \frac{\Gamma, (\forall x.A[x]) \to B \vdash G}{\Gamma, B \vdash G}\;(6)$$

$$\frac{\Gamma, P \to B \vdash G}{\Gamma, B \vdash G}\;(4) \qquad \frac{\Gamma, I(\vec{t}) \vdash G}{\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash G}\;(7)$$

$$\frac{\Gamma, (C \to D) \to B \vdash G}{\Gamma, B \vdash G}\;(5) \qquad \frac{\Gamma, I(\vec{t}) \to B \vdash G}{\Gamma, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i \vdash G}\;(8)$$

Proof. By induction on the height of the derivation, using rule (1) to rename
eigenvariables and for rule (7).

3.2 Admissibility of Contraction 

We first show that the generalized axiom rule is admissible in LJTI, and we
obtain the admissibility of contraction, which allows us to show the admissibility
of the generic $L{\to}$ rule used in standard sequent calculi.

To perform induction on formulae, we define a notion of weight which is given
below:

$$\mathrm{wt}(P) = 1, \quad P \text{ atomic}$$
$$\mathrm{wt}(A \to B) = 1 + \mathrm{wt}(A) + \mathrm{wt}(B)$$
$$\mathrm{wt}(\forall x.A[x]) = 1 + \mathrm{wt}(A[x])$$
$$\mathrm{wt}(I(\vec{t})) = \sum_i \mathrm{wt}(C_i(\vec{t}))$$

if $C_i(\vec{t}) : \forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to I(\vec{t})$ then

$$\mathrm{wt}(C_i(\vec{t})) = (2 \times \mathrm{length}(\vec{y_i})) + \sum_j 1 + \mathrm{wt}(H_{i,j}(\vec{t}, \vec{y_i}))$$

This weight is lower than the one in [6] in the case of disjunction, but in
fact Dyckhoff's proof is valid even with our weight. The essential fact about this
weight is that the rules about inductive formulae that actually have premises,
when read upward, replace their principal formula with strictly lighter formulae
or remove them.
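The weight can be computed by a direct recursion. The Python sketch below is restricted to propositional families for brevity and assumes that the sum over hypotheses is read as the sum of $1 + \mathrm{wt}(H_{i,j})$ for each $j$; the tuple encoding of formulae and the `FAMILIES` table are assumptions of this illustration.

```python
# Formulae: ("atom", name) | ("imp", A, B) | ("all", x, A) | ("ind", family, args)
# Inside a family definition, ("param", k) stands for the k-th formal parameter.
FAMILIES = {
    "and": [((), [("param", 0), ("param", 1)])],
    "or": [((), [("param", 0)]), ((), [("param", 1)])],
    "false": [],
    "true": [((), [])],
}


def subst(formula, args):
    """Replace formal parameters by the actual arguments of the family."""
    tag = formula[0]
    if tag == "param":
        return args[formula[1]]
    if tag == "atom":
        return formula
    if tag == "imp":
        return ("imp", subst(formula[1], args), subst(formula[2], args))
    if tag == "all":
        return ("all", formula[1], subst(formula[2], args))
    return ("ind", formula[1], tuple(subst(a, args) for a in formula[2]))


def wt(formula):
    tag = formula[0]
    if tag == "atom":
        return 1
    if tag == "imp":
        return 1 + wt(formula[1]) + wt(formula[2])
    if tag == "all":
        return 1 + wt(formula[2])
    # Inductive formula: sum of the constructor weights (assumed reading of the sum).
    return sum(2 * len(ys) + sum(1 + wt(subst(h, formula[2])) for h in hyps)
               for ys, hyps in FAMILIES[formula[1]])


if __name__ == "__main__":
    A, B, C = ("atom", "A"), ("atom", "B"), ("atom", "C")
    conj = ("ind", "and", (A, B))
    print(wt(("imp", conj, C)), wt(("imp", A, ("imp", B, C))))
```

Under that reading, the formula with a conjunction on the left of an arrow weighs 6 while its curried replacement weighs 5, an instance of the decrease stated above.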

In our proofs, steps labeled Ind use the induction hypothesis; we use
the double bar to distinguish those steps from the others. Admissible rules are
labeled by the lemma in which they were introduced.

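Lemma 3. These sequents are provable in LJTI, even if $A$ is not atomic:

$$\Gamma, A \vdash A \quad (9) \qquad\qquad \Gamma, A, A \to B \vdash B \quad (10)$$

Proof. 1. We prove by induction on $\mathrm{wt}(A)$ that for any $\Gamma$ we have $\Gamma, A \vdash A$,
by cases on the shape of $A$.

- If $A = I(\vec{t})$ then for all constructors $C_i$ and logical hypotheses $H_{i,j}$ we
have by induction hypothesis $\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash H_{i,j}(\vec{t}, \vec{y_i})$, so for each $i$
we have $\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash I(\vec{t})$ by $RI_i$. Since we can choose $\vec{y_i}$ that do not
occur free in $\Gamma$ nor in $\vec{t}$, we can use $LI$ to obtain $\Gamma, I(\vec{t}) \vdash I(\vec{t})$.

- If $A = I(\vec{t}) \to B$, $B$ being an arbitrary formula, for any constructor $C_k$
and set of formulae $\Delta$ we have by induction hypothesis:

$$\Gamma, \Delta, \vec{H_k}(\vec{t}, \vec{z_k}) \to B \vdash \vec{H_k}(\vec{t}, \vec{z_k}) \to B$$

Using rule (3) for each $H_{k,j}$ we get:

$$\Gamma, \Delta, \vec{H_k}(\vec{t}, \vec{z_k}) \to B, \vec{H_k}(\vec{t}, \vec{z_k}) \vdash B$$

Now we choose $\Delta = \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i, \Delta'$, where $\Delta'$ is the sequence
of formulae obtained by instantiating one or more of the $\vec{y_k}$ by the $\vec{z_k}$ in
$\forall\vec{y_k}.\vec{H_k}(\vec{t}, \vec{y_k}) \to B$. We can use $L\forall$ for each variable in $\vec{z_k}$ with that formula and
the formulae in $\Delta'$, and we obtain for each constructor $C_k$:

$$\Gamma, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i, \vec{H_k}(\vec{t}, \vec{z_k}) \vdash B$$

We can choose the $\vec{z_k}$ so that they are not free in $\Gamma$, $\vec{t}$ or $B$, and from
there we have:

$$\dfrac{\dfrac{\{\Gamma, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i, \vec{H_k}(\vec{t}, \vec{z_k}) \vdash B\}_k}{\Gamma, I(\vec{t}) \to B, I(\vec{t}) \vdash B}\;LI,\ LI{\to}}{\Gamma, I(\vec{t}) \to B \vdash I(\vec{t}) \to B}\;R{\to}$$

- The other cases are handled similarly to [6].

2. We have $\Gamma, A \to B \vdash A \to B$ by rule (9), so we get $\Gamma, A, A \to B \vdash B$ by rule (3).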



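Lemma 4. The following rule is admissible in LJTI:

$$\frac{\Gamma \vdash D \qquad \Gamma, B \vdash E}{\Gamma, D \to B \vdash E}\;(11)$$

Proof. By induction on the height of the derivation $d$ of the first premise and by
cases on its last step.

- If it is by an axiom then $D$ is atomic and $\Gamma = \Gamma', D$. We have:

$$\frac{\Gamma', D, B \vdash E}{\Gamma', D, D \to B \vdash E}\;La{\to}$$

- If it is by $RI_i$ then let $D = I(\vec{t})$:

$$\dfrac{\dfrac{\dfrac{\Gamma, B \vdash E}{\Gamma, \vec{H_i}(\vec{t}, \vec{u}) \to B \vdash E}\;\text{some Ind}}{\Gamma, \{\forall\vec{y_k}.\vec{H_k}(\vec{t}, \vec{y_k}) \to B\}_k \vdash E}\;\text{some } L\forall}{\Gamma, I(\vec{t}) \to B \vdash E}\;LI{\to}$$

- If it is by $LI$ then $\Gamma = \Gamma', I(\vec{t})$. We have for every constructor $C_i$:

$$\dfrac{\dfrac{\Gamma', I(\vec{t}), B \vdash E}{\Gamma', \vec{H_i}(\vec{t}, \vec{y_i}), B \vdash E}\;\text{rule (7)}}{\Gamma', \vec{H_i}(\vec{t}, \vec{y_i}), D \to B \vdash E}\;\text{Ind}$$

If we choose $\vec{y_i}$ so that they are not free in $\Gamma'$, $\vec{t}$, $B$ or $E$, we use $LI$ and get
$\Gamma', I(\vec{t}), D \to B \vdash E$.

- If it is by $LI{\to}$ then $\Gamma = \Gamma', I(\vec{t}) \to C$. We have:

$$\dfrac{\dfrac{\Gamma', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash D \qquad \dfrac{\Gamma', I(\vec{t}) \to C, B \vdash E}{\Gamma', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i, B \vdash E}\;\text{rule (8)}}{\Gamma', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i, D \to B \vdash E}\;\text{Ind}}{\Gamma', I(\vec{t}) \to C, D \to B \vdash E}\;LI{\to}$$

- See [6] for the other cases.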

Lemma 5. The following rule is admissible in LJTI:

$$\frac{\Gamma, (C \to D) \to B \vdash E}{\Gamma, C, D \to B, D \to B \vdash E}\;(12)$$

Proof. The interesting case where $(C \to D) \to B$ is principal is treated similarly
to [6], using rule (11).



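Theorem 1. The Contraction rule below is admissible in LJTI.

$$\frac{\Gamma, A, A \vdash G}{\Gamma, A \vdash G}\;Contr$$

Proof. By lexicographic induction on $\mathrm{wt}(A)$ and the height of the derivation of
the premise. If $A$ is not principal in the last step deriving the premise, we use
the induction hypothesis on the premise(s) of this step and apply the rule on the
contracted premise. If $A$ is principal, we do a case analysis on the shape of $A$.

- If $A$ is an atomic formula $P$ then the last rule is an axiom and $G = P$, so the
conclusion is an axiom.

- If $A = I(\vec{t})$ then we have for each $i$:

$$\dfrac{\dfrac{\Gamma, I(\vec{t}), \vec{H_i}(\vec{t}, \vec{y_i}) \vdash G}{\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}), \vec{H_i}(\vec{t}, \vec{y_i}) \vdash G}\;\text{rule (7)}}{\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash G}\;\text{some Ind}$$

Since the $\vec{y_i}$ are not free in $\Gamma$ or $\vec{t}$ nor in $G$, we can use $LI$ to get
$\Gamma, I(\vec{t}) \vdash G$.

- If $A = I(\vec{t}) \to C$ then we have:

$$\dfrac{\dfrac{\dfrac{\Gamma, I(\vec{t}) \to C, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash G}{\Gamma, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash G}\;\text{rule (8)}}{\Gamma, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash G}\;\text{some Ind}}{\Gamma, I(\vec{t}) \to C \vdash G}\;LI{\to}$$

- Otherwise we do as in [6], using rule (12) when $A = (C \to D) \to B$.

This closes our proof by induction.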



Lemma 6. The following rule is admissible in LJTI:

$$\frac{\Gamma, A \to B \vdash A \qquad \Gamma, B \vdash G}{\Gamma, A \to B \vdash G}\;(13)$$

Proof.

$$\dfrac{\dfrac{\Gamma, A \to B \vdash A \qquad \dfrac{\Gamma, B \vdash G}{\Gamma, A \to B, B \vdash G}\;W}{\Gamma, A \to B, A \to B \vdash G}\;\text{rule (11)}}{\Gamma, A \to B \vdash G}\;Contr$$

This last lemma shows us that the LJTI calculus is complete with respect
to the calculus, which could be named LJI, where the axiom rule would be the
generalized one and all left arrow rules would be replaced by the one from the
lemma.



3.3 Cut-Elimination Theorem 

The proof outline follows that of [6], except that with our notion of inductive
formula there are fewer cases to consider.

Theorem 2. The Cut rule below is admissible in LJTI.

$$\frac{\Gamma \vdash A \qquad \Gamma', A \vdash E}{\Gamma, \Gamma' \vdash E}\;Cut$$

Proof. By lexicographic induction on $\mathrm{wt}(A)$ and on the sum of the heights of
the derivations of the premises:

If the first premise is an axiom, let $\Gamma = \Gamma'', A$:

$$\frac{\Gamma', A \vdash E}{\Gamma'', \Gamma', A \vdash E}\;W$$

If the second premise is an axiom, either $E \in \Gamma'$, and the conclusion
is an axiom, or $A = E$, and the conclusion follows from the first premise by
weakening (rule (2)).

Otherwise, neither premise is an axiom.

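If $A$ is not principal on the left, by cases on the last step of the left derivation:

- $LI$: We have

$$\dfrac{\dfrac{\{\Gamma'', \vec{H_i}(\vec{t}, \vec{y_i}) \vdash A\}_i}{\Gamma'', I(\vec{t}) \vdash A}\;LI \qquad \Gamma', A \vdash E}{\Gamma'', I(\vec{t}), \Gamma' \vdash E}\;Cut$$

For each $i$ we use the induction hypothesis:

$$\frac{\Gamma'', \vec{H_i}(\vec{t}, \vec{y_i}) \vdash A \qquad \Gamma', A \vdash E}{\Gamma'', \vec{H_i}(\vec{t}, \vec{y_i}), \Gamma' \vdash E}\;Cut$$

After renaming $\vec{y_i}$ if they occur free in $\Gamma'$ or $E$, we use $LI$ and obtain
$\Gamma'', I(\vec{t}), \Gamma' \vdash E$.

- $LI{\to}$:

$$\dfrac{\dfrac{\Gamma'', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash A}{\Gamma'', I(\vec{t}) \to C \vdash A}\;LI{\to} \qquad \Gamma', A \vdash E}{\Gamma'', I(\vec{t}) \to C, \Gamma' \vdash E}\;Cut$$

becomes:

$$\dfrac{\dfrac{\Gamma'', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash A \qquad \Gamma', A \vdash E}{\Gamma'', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i, \Gamma' \vdash E}\;Cut}{\Gamma'', I(\vec{t}) \to C, \Gamma' \vdash E}\;LI{\to}$$

- The other cases are dealt with similarly to [6].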

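If $A$ is principal on the left and not on the right, by cases on the last step of
the right premise derivation:

- $RI_i$: We have

$$\dfrac{\Gamma \vdash A \qquad \dfrac{\{\Gamma', A \vdash H_{i,j}(\vec{t}, \vec{u})\}_j}{\Gamma', A \vdash I(\vec{t})}\;RI_i}{\Gamma, \Gamma' \vdash I(\vec{t})}\;Cut$$

For each $j$ we use the induction hypothesis:

$$\frac{\Gamma \vdash A \qquad \Gamma', A \vdash H_{i,j}(\vec{t}, \vec{u})}{\Gamma, \Gamma' \vdash H_{i,j}(\vec{t}, \vec{u})}\;Cut$$

And we use the $RI_i$ rule to get $\Gamma, \Gamma' \vdash I(\vec{t})$.

- $LI$:

$$\dfrac{\Gamma \vdash A \qquad \dfrac{\{\Gamma'', A, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash E\}_i}{\Gamma'', A, I(\vec{t}) \vdash E}\;LI}{\Gamma, \Gamma'', I(\vec{t}) \vdash E}\;Cut$$

For each $i$ we use the induction hypothesis:

$$\frac{\Gamma \vdash A \qquad \Gamma'', A, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash E}{\Gamma, \Gamma'', \vec{H_i}(\vec{t}, \vec{y_i}) \vdash E}\;Cut$$

After renaming $\vec{y_i}$ if they occur free in $\Gamma$, we use $LI$ to get $\Gamma, \Gamma'', I(\vec{t}) \vdash E$.

- $LI{\to}$:

$$\dfrac{\Gamma \vdash A \qquad \dfrac{\Gamma'', A, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash E}{\Gamma'', A, I(\vec{t}) \to C \vdash E}\;LI{\to}}{\Gamma, \Gamma'', I(\vec{t}) \to C \vdash E}\;Cut$$

becomes:

$$\dfrac{\dfrac{\Gamma \vdash A \qquad \Gamma'', A, \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash E}{\Gamma, \Gamma'', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to C\}_i \vdash E}\;Cut}{\Gamma, \Gamma'', I(\vec{t}) \to C \vdash E}\;LI{\to}$$

- Otherwise, see [6].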

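If $A$ is principal in both premises, by cases on the shape of $A$:

- $A = I(\vec{t})$:

$$\dfrac{\dfrac{\{\Gamma \vdash H_{i,j}(\vec{t}, \vec{u})\}_j}{\Gamma \vdash I(\vec{t})}\;RI_i \qquad \dfrac{\{\Gamma', \vec{H_k}(\vec{t}, \vec{y_k}) \vdash E\}_k}{\Gamma', I(\vec{t}) \vdash E}\;LI}{\Gamma, \Gamma' \vdash E}\;Cut$$

becomes:

$$\dfrac{\{\Gamma \vdash H_{i,j}(\vec{t}, \vec{u})\}_j \qquad \dfrac{\Gamma', \vec{H_i}(\vec{t}, \vec{y_i}) \vdash E}{\Gamma', \vec{H_i}(\vec{t}, \vec{u}) \vdash E}\;\text{rule (1)}}{\Gamma, \Gamma' \vdash E}\;\text{some Cut}$$

- $A = I(\vec{t}) \to B$:

$$\dfrac{\dfrac{\Gamma, I(\vec{t}) \vdash B}{\Gamma \vdash I(\vec{t}) \to B}\;R{\to} \qquad \dfrac{\Gamma', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i \vdash E}{\Gamma', I(\vec{t}) \to B \vdash E}\;LI{\to}}{\Gamma, \Gamma' \vdash E}\;Cut$$

For each constructor index $i$ we can turn our proof of $\Gamma, I(\vec{t}) \vdash B$ into a
proof of $\Gamma, \vec{H_i}(\vec{t}, \vec{y_i}) \vdash B$ by rule (7), choosing the $\vec{y_i}$ not free in $B$, $\Gamma$, $\vec{t}$ and $E$.

Thus we can use $R{\to}$ and $R\forall$ to obtain $\Gamma \vdash \forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B$.

Then for each $i$, we do a cut on $\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B$ with the second premise
$\Gamma', \{\forall\vec{y_i}.\vec{H_i}(\vec{t}, \vec{y_i}) \to B\}_i \vdash E$ and we get the sequent $\Gamma, \ldots, \Gamma, \Gamma' \vdash E$, which
we reduce to $\Gamma, \Gamma' \vdash E$ by making contractions on $\Gamma$.

- The other cases are dealt with as in [6].

This closes our proof by induction.

This gives us the cut-elimination property for LJTI by removing the topmost
cuts first.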

4 Embedding Our Calculus in a Proof-Search Procedure 

4.1 Basic Strategy 

To perform bottom-up proof-search using our calculus, we use bounded depth- 
first search, using our bound on non-decreasing rules. 

We first notice that we can do without the atomicity condition in the $Ax$ and
$La{\to}$ rules, since those generalized rules are admissible: for $Ax$ see rule (9),
and for $La{\to}$ use rule (10), cut and contraction. This can speed up proofs by
avoiding the destruction of two opposite occurrences of the same compound
formula followed by as many axiom rules as the number of its subformulae.

In order to refine our strategies we separated the inductive families into classes.
First we distinguish between first-order inductive families, whose constructors may have
first-order (quantified) variables, and propositional inductive families, whose
constructors are propositional formulae. Among the latter we have three classes:

- Those with no constructor are the absurd class (for instance $\bot$)

- Those with one constructor are the conjunctive class ($\wedge$, $\top$, ...)

- Those with more than one constructor are the disjunctive class ($\vee$, Dec, ...)
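The classification above depends only on the number of constructors and on whether any of them binds first-order variables. A minimal Python sketch, assuming each constructor is given as a pair of its first-order variables and its hypotheses (an encoding chosen for this example):

```python
def classify(constructors):
    """Classify an inductive family given as a list of (variables, hypotheses)."""
    if any(variables for variables, _ in constructors):
        return "first-order"
    if len(constructors) == 0:
        return "absurd"
    if len(constructors) == 1:
        return "conjunctive"
    return "disjunctive"


if __name__ == "__main__":
    families = {
        "false": [],
        "true": [((), ())],
        "and": [((), ("A", "B"))],
        "or": [((), ("A",)), ((), ("B",))],
        "Dec": [((), ("A",)), ((), ("A -> false",))],
        "ex": [(("y",), ("H[y]",))],
    }
    for name, constructors in families.items():
        print(name, classify(constructors))
```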

Of course the axiom and left-absurdity rules are to be used as soon as possible.
Moreover, it is fundamental that we try to apply the generalized $La{\to}$ rule before
trying any $LI{\to}$ rule in order to shortcut that part of parallel destruction.

This calculus also has a lot of invertible rules which must be used before
the non-invertible ones, because no backtracking is needed if the proof
fails afterwards. Notice that for the conjunctive class, the right introduction rule is
invertible. Moreover, some rules like $L\forall{\to}$ and $L{\to}{\to}$ are only partially invertible,
so we first try to prove the non-invertible premise and if we succeed there will
be no need to backtrack if the second premise fails.

$$\begin{array}{ll}
\mathcal{SF}^+(A) = +A \quad (A \text{ atomic}) & \mathcal{SF}^-(A) = -A \quad (A \text{ atomic}) \\
\mathcal{SF}^+(A \to B) = \mathcal{SF}^-(A) \cup \mathcal{SF}^+(B) & \mathcal{SF}^-(A \to B) = \mathcal{SF}^+(A) \cup \mathcal{SF}^-(B) \\
\mathcal{SF}^+(\forall x.P[x]) = \mathcal{SF}^+(P[?_n]) & \mathcal{SF}^-(\forall x.P[x]) = \mathcal{SF}^-(P[?_n]) \\
\mathcal{SF}^+(I(\vec{t})) = \bigcup_{i,j} \mathcal{SF}^+(H_{i,j}(\vec{t}, \vec{?_n})) & \mathcal{SF}^-(I(\vec{t})) = \bigcup_{i,j} \mathcal{SF}^-(H_{i,j}(\vec{t}, \vec{?_n}))
\end{array}$$

(where $?_n$ and $\vec{?_n}$ are fresh metavariables)

$$\mathcal{SF}(\Gamma \vdash G) = \mathcal{SF}^+(G) \cup \bigcup_{H \in \Gamma} \mathcal{SF}^-(H)$$

Fig. 2. Signed atomic subformulae

The last point is that some rules generate more than one subgoal to be 
proved, so we try to delay them as much as possible. 



4.2 Instantiation Strategy 

When all else fails we try to apply the instantiating rules $L\forall$ and $RI_i$, with $I$ a
first-order inductive family. To use those rules some terms $t$ must replace the quantified
variable(s). To find these terms, we use a well-known notion of polarity (see for
instance [11]) to define the set of signed atomic subformulae $\mathcal{SF}(\Gamma \vdash G)$ of a
sequent by induction on the structure of its formulae (see figure 2).
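A minimal Python sketch of this computation follows. It handles atoms, implication, the universal quantifier and, as the only inductive family, the existential quantifier; the tuple encoding, the `?n` naming of metavariables and the function names are assumptions of this example.

```python
import itertools

# Terms: a string (variable or constant) or a tuple (function, arg1, ...).
# Formulae: ("atom", pred, args) | ("imp", A, B) | ("all", x, A) | ("ex", x, A)


def subst_term(term, var, new):
    if isinstance(term, tuple):
        return (term[0],) + tuple(subst_term(a, var, new) for a in term[1:])
    return new if term == var else term


def subst(formula, var, new):
    tag = formula[0]
    if tag == "atom":
        return ("atom", formula[1], tuple(subst_term(a, var, new) for a in formula[2]))
    if tag == "imp":
        return ("imp", subst(formula[1], var, new), subst(formula[2], var, new))
    if formula[1] == var:          # the variable is rebound: stop here
        return formula
    return (tag, formula[1], subst(formula[2], var, new))


def signed(formula, sign, fresh):
    """Set of (sign, predicate, arguments) triples; sign is +1 or -1."""
    tag = formula[0]
    if tag == "atom":
        return {(sign, formula[1], formula[2])}
    if tag == "imp":
        return signed(formula[1], -sign, fresh) | signed(formula[2], sign, fresh)
    meta = f"?{next(fresh)}"       # quantified variable becomes a fresh metavariable
    return signed(subst(formula[2], formula[1], meta), sign, fresh)


def signed_sequent(hypotheses, goal):
    fresh = itertools.count(1)
    result = set()
    for h in hypotheses:
        result |= signed(h, -1, fresh)
    return result | signed(goal, +1, fresh)


if __name__ == "__main__":
    hyp = ("all", "x", ("atom", "P", ("x",)))
    goal = ("ex", "y", ("atom", "P", (("f", "y"),)))
    print(signed_sequent([hyp], goal))
```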

We remark that signed atomic subformulae in premises of rules are also in
the conclusion, maybe in a more general form (with some terms replaced with
metavariables). This can be seen as a kind of subformula property in our calculus,
and in the end we only need pairs of matching subformulae of opposite signs used
in axiom or $La{\to}$ rules, and inductive formulae with terminal rules: negative
absurdity or positive tautology 2 . We call those particular subformulae trivial
subformulae, and they are also necessary in a derivation.

When we want to use a trivial subformula under a quantifier or an inductive
definition to prove our sequent, we just need any term $t$ to instantiate our
quantified variable, in order to bring that trivial subformula to the top and apply
a terminating rule. So we create a goal stating that we have a term to instantiate
our variable and ask Coq to use trivial or auto to solve that non-logical goal.
We have to use this trick because in Coq, unlike first-order logic, the
quantification domain may be empty, and this emptiness is undecidable in general (type
inhabitation is what Coq is all about).

2 We call tautology any propositional inductive family with a constant constructor.

Otherwise we try to build matching pairs of atomic subformulae. We do this
by using first-order unification between atomic subformulae of opposite
sign, and by looking at the terms associated with the quantified variables in the
unifiers. For example, if we have to prove that $\forall x.P[x] \vdash \exists y.P[f(y)]$, we have the
signed atomic subformulae $-P[?_1]$ and $+P[f(?_2)]$, and we have $\{?_1 \mapsto f(?_2)\}$
as a unifier. So we will try to use a term of the form $f(?_2)$ to instantiate $?_1$.
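The unifier in this example can be reproduced with a textbook first-order unification procedure. The Python sketch below is such a procedure with an occurs check; it is not the algorithm used inside Coq (which also performs some reduction), and the term encoding with `?`-prefixed metavariables is an assumption of this example.

```python
def is_meta(term):
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow metavariable bindings until an unbound term is reached."""
    while is_meta(term) and term in subst:
        term = subst[term]
    return term


def occurs(meta, term, subst):
    term = walk(term, subst)
    if term == meta:
        return True
    return isinstance(term, tuple) and any(occurs(meta, a, subst) for a in term[1:])


def unify(left, right, subst=None):
    """Return a substitution (dict) unifying the two terms, or None on failure."""
    subst = dict(subst or {})
    stack = [(left, right)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_meta(x):
            if occurs(x, y, subst):
                return None
            subst[x] = y
        elif is_meta(y):
            if occurs(y, x, subst):
                return None
            subst[y] = x
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            stack.extend(zip(x[1:], y[1:]))
        else:
            return None
    return subst


if __name__ == "__main__":
    # -P[?1] against +P[f(?2)], written as terms headed by the predicate symbol.
    print(unify(("P", "?1"), ("P", ("f", "?2"))))
```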

Now, we can get three different kinds of terms to instantiate our variables: 
ground terms (without metavariables) , open terms (containing metavariables but 
not outermost), or trivial terms (equal to a metavariable). 

If we get ground terms we just use them, turning $\forall x.P[x]$ into $P[t]$.

If we get open terms, we specialize our quantified formulae: in our example
with $f(?_2)$, we turn $\forall x.P[x]$ into $\forall y.P[f(y)]$. For positive inductive formulae we do
the same and we use $\exists$ to quantify over open positions in the term. For instance
if we consider the following goal:

$$\Gamma, \forall x.\forall y.y = 2 \times f(y, x) + 1 \vdash \mathrm{Eucl\_div}(a, 2)$$

The unification algorithm will yield $f(a, ?_1)$ for $q$ and $1$ for $r$, and the
specialization scheme will give the following goal to try to prove:

$$\Gamma, \forall x.\forall y.y = 2 \times f(y, x) + 1 \vdash \exists x. 1 < 2 \wedge a = 2 \times f(a, x) + 1$$

If we get trivial terms, it means that there is a formula of opposite sign that
unifies with this one and that this one doesn't need to be specialized; this is
the case for example in $\forall x.P[x] \vdash \exists y.P[y]$. In that case, we proceed as we do
with trivial subformulae and we get an additional Coq subgoal about domain
inhabitation. Having destroyed our quantifier, we can hope the search procedure
will finally bring the matching subformula in outermost position.
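The three kinds of instantiation terms can be told apart with a short recursive test. This Python sketch assumes terms are strings or tuples headed by a function symbol, with metavariables written as strings starting with `?`; both conventions are choices made for this example.

```python
def contains_meta(term):
    """True if a metavariable occurs anywhere in the term."""
    if isinstance(term, tuple):
        return any(contains_meta(arg) for arg in term[1:])
    return term.startswith("?")


def kind(term):
    """Classify a candidate instantiation term as ground, open or trivial."""
    if isinstance(term, str) and term.startswith("?"):
        return "trivial"
    return "open" if contains_meta(term) else "ground"


if __name__ == "__main__":
    for term in ("1", ("f", "a", "?1"), "?2"):
        print(term, kind(term))
```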

You can argue that our specialization scheme leads to non-termination, but
in fact the calculus itself doesn't terminate, so we just place a counter on the use
of those rules plus the $L\forall{\to}$ rule, and we give a bound to our search procedure.

4.3 The firstorder Tactic 

As announced earlier, this proof-search procedure is available in Coq. Since our
experience in maintaining the tauto/intuition tactic showed us that a lot of
time was spent doing pattern matching on contexts (see [3]), we decided to avoid
doing it too often.

So we decided to work at the ML level with a persistent data structure
reflecting the logical content of the current subgoal. Since we are keeping track
of the head-form of our formulae, we can work modulo constant unfolding and
$\beta\iota$-reduction at a very low performance cost. The unification algorithm also
does some reduction, but it is basically first-order unification since we are not
supposed to have any quantified variable at the head of an application.

This implementation choice gave very encouraging results when compared to
tauto. In some propositional examples firstorder solved the goal in less than
1 minute where tauto ran overnight without giving a result. For example try
$(A_0 \leftrightarrow A_1) \to (A_1 \leftrightarrow A_2),\ (A_1 \leftrightarrow A_2) \to (A_2 \leftrightarrow A_0),\ (A_2 \leftrightarrow A_0) \to (A_0 \leftrightarrow A_1) \vdash A_0 \to A_1$
with a bigger odd number of variables.

The firstorder tactic is available in Coq v8.0 and can be used like tauto.
A global integer option may be set using the command Set Firstorder Depth n.
This option is the maximum number of non-terminating rules allowed in a
branch of the proof, so increasing it may allow your goal to be solved at the cost
of a longer search time.

However, in the current state, all propositional inductive definitions are sup- 
ported but first-order ones are only supported when they have one constructor 
with only one first-order variable. We are planning to fully support first-order 
inductive families in the near future. 

5 Conclusion and Future Work 

We have presented a contraction-free sequent calculus to deal with first-order
intuitionistic logic in the Coq proof assistant, where most connectives are defined
as inductive families. We have shown that this contraction-free calculus enjoys
admissibility of contraction and cut-elimination, thus establishing a weak form
of subformula property. We have shown how this calculus was implemented as a
proof-search tactic in Coq.

Although our inductive formulae do not have more expressivity than stan- 
dard first-order intuitionistic logic, they give a more uniform reasoning frame- 
work. From a more practical point of view they allow users to define their own 
connectives without having to consider if they would be supported by such or 
such automatic tactic. 

Besides searching for smarter search strategies, there are two directions in
which we should extend our work to be able to deal with inductive predicates
such as le (less than or equal to). One is to try to consider recursive inductive
definitions for which our theorems don't apply as such: the weight function
is not well-defined and we may lose completeness. The other one is to have
inductive predicates whose type in Coq would be an arity instead of a simple
sort, but equality is one of those predicates, so we couldn't avoid performing
some equational reasoning. Indeed, we plan to handle such cases as part of a
more general integration of equational reasoning and our proof-search procedure
in Coq.

Abstract. Higher-Order Linear Ramified Recurrence (HOLRR) is a
linear (affine) $\lambda$-calculus, in which every variable occurs at most once,
extended with a recursive scheme on free algebras. Two simple conditions
on type derivations enforce both polytime completeness and a
strong notion of polytime soundness on typeable terms. Completeness
for PTIME holds by embedding Leivant's ramified recurrence on words
into HOLRR. Soundness is established at all types, and not only for
first-order terms. Type connectives are limited to tensor and linear
implication. Moreover, typing rules are given as a simple deductive system.



1 Introduction 

The main goal of giving machine-independent characterizations of PTIME is to 
overcome the drawback of conceiving feasible algorithms by thinking directly in 
terms of low-level machine primitives, like those of Turing machines. The research 
about this subject has brought forth a wide variety of interesting calculi, that 
can be classified under two parameters: their originating background, and their 
expressivity — the ability to naturally express higher-order functions. 

Concerning the originating background, proposals range from those which 
are purely recursion-theoretical to the ones which are purely proof-theoretical. 

For example, Bellantoni and Cook’s safe recursion on notation [3] is of the 
former kind. Its recursive scheme forbids application of a recursively defined 
function to the result of a recursive call. The constraint is expressed directly 
inside the syntax of the recursive schemes, by distinguishing two classes of ar- 
guments, namely safe and normal arguments. Another example of a first-order 
function algebra capturing PTIME is Leivant’s ramified recurrence on words 
[11,12], which relies on the notion of tier to control the use of arguments in 
recursive schemes. 

On the other side, purely proof-theoretical systems are logical, deductive systems,
usually expressed in a graph language, that of proof-nets. Main examples
of this class are light linear logic (LLL, [6]), light affine logic (LAL, [1,2]), and
soft linear logic (SLL, [10]). Boxes are certain regions inside a proof-net, and a
box may contain other boxes, in a stratified fashion. The computational core is
cut-elimination, whose complexity is controlled by box stratification: the time
necessary to normalize a proof-net is polynomial in the size of the proof, the
exponent of the polynomial depending only on the box-nesting depth. This,
together with the fact that usual data types can be coded by fixed-depth proofs,
implies polytime soundness.

Many interesting systems should be classified in between these two styles. In
these systems, recursion is embedded into typed calculi, and other mechanisms
(usually ramification or linearity) are needed to control the growth of computational
complexity. Typical examples of such systems, here dubbed
type-theoretical, are HOSLR [8,9] and LT [4,5]. In the two systems mentioned, the
syntax of Gödel's system T is modified to accommodate safe recursion, but a
number of additional constraints, a restricted form of linearity in primis, are
needed to guarantee polynomial soundness. Generalizations of ramified recurrence
to higher-order types are presented in [15,14]. In these systems, however,
the lack of any linearity constraint prevents obtaining a polytime bound.
Indeed, at higher types they show either a poly-space or a Kalmár elementary
bound. Another related work is [13], where syntactical restrictions on a simply
typed calculus with constants and recursion make it possible to restrict the space of
representable functions to relevant complexity classes.

Results. In this paper, we introduce the system of Higher-Order Linear Ramified
Recurrence (HOLRR). It is a type-theoretical system smoothly blending
both recursion and proof-theoretic components.

The proof-theoretical core of the system is a linear affine $\lambda$-calculus: any
variable can be used at most once. Recursion is embedded in the system as a
variable binder, whose syntax is inspired by boxes of linear lambda calculi. The
types are generated by the usual multiplicative connectives (tensor and arrow).
Base types include denumerably many copies of several free algebras. There is
no need for additional type constructs; in particular, there is no explicit modality.

Our principal aim is to obtain results akin to those of Bellantoni, Niggl and
Schwichtenberg's LT [4,5], but in a framework with a polynomial bound expressed
as a function of specific parameters of the term. Sect. 5 analytically
compares the two systems. Here, we stress that: (i) no additional syntactic
restriction on terms is needed, besides those induced by typeability; (ii) the degree
of the polynomial bounding the normalization time of a term $M$ depends only on
one parameter of a type derivation for $M$, namely its recursion depth.

In particular, we prove a soundness result à la LLL. Under a given strategy,
any term which can be typed satisfying two simple conditions (word-contextuality
and ramification) normalizes in a polynomially bounded time. To be precise, we
will prove that, for any (word-contextual and ramified) type derivation $\pi$ for $M$,
$M$ normalizes in time $O(|M|^h)$, where $h$ depends only on the recursion depth of
$\pi$. This means that, whenever the recursion depth of type derivations for terms
encoding input data is bounded, the defining function is polytime, a similar
situation occurring in LLL or LAL.

Completeness for PTIME holds by embedding Leivant's ramified recurrence
on words into HOLRR.

2 Syntax 

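A free algebra $\mathbb{A}$ is a couple $(\mathcal{C}_\mathbb{A}, \mathcal{R}_\mathbb{A})$ where $\mathcal{C}_\mathbb{A} = \{c_1^\mathbb{A}, \ldots, c_{k(\mathbb{A})}^\mathbb{A}\}$ is a finite set
of constructors and $\mathcal{R}_\mathbb{A} : \mathcal{C}_\mathbb{A} \to \mathbb{N}$ maps every constructor to its arity. A free
algebra $\mathbb{A} = (\{c_1^\mathbb{A}, \ldots, c_{k(\mathbb{A})}^\mathbb{A}\}, \mathcal{R}_\mathbb{A})$ is a word algebra if

• $\mathcal{R}_\mathbb{A}(c_i^\mathbb{A}) = 0$ for one (and only one) $i \in \{1, \ldots, k(\mathbb{A})\}$;

• $\mathcal{R}_\mathbb{A}(c_j^\mathbb{A}) = 1$ for every $j \neq i$ in $\{1, \ldots, k(\mathbb{A})\}$.

If $\mathbb{A} = (\{c_1^\mathbb{A}, \ldots, c_{k(\mathbb{A})}^\mathbb{A}\}, \mathcal{R}_\mathbb{A})$ is a word algebra, we will assume $c_{k(\mathbb{A})}^\mathbb{A}$ to be the
distinguished element of $\mathcal{C}_\mathbb{A}$ whose arity is 0, and $c_1^\mathbb{A}, \ldots, c_{k(\mathbb{A})-1}^\mathbb{A}$ will denote the
elements of $\mathcal{C}_\mathbb{A}$ whose arity is 1. $\mathbb{B} = (\{c_1^\mathbb{B}, c_2^\mathbb{B}, c_3^\mathbb{B}\}, \mathcal{R}_\mathbb{B})$ is the word algebra of
binary strings. $\mathbb{C} = (\{c_1^\mathbb{C}, c_2^\mathbb{C}\}, \mathcal{R}_\mathbb{C})$, where $\mathcal{R}_\mathbb{C}(c_1^\mathbb{C}) = 2$ and $\mathcal{R}_\mathbb{C}(c_2^\mathbb{C}) = 0$, is the free
algebra of binary trees.

$\mathscr{A}$ will be a fixed, finite family $\{\mathbb{A}_1, \ldots, \mathbb{A}_n\}$ of free algebras, where the constructor
sets $\mathcal{C}_{\mathbb{A}_1}, \ldots, \mathcal{C}_{\mathbb{A}_n}$ are assumed to be pairwise disjoint. We will hereby assume
both $\mathbb{B}$ and $\mathbb{C}$ to be in $\mathscr{A}$.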
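The definitions of this paragraph can be checked mechanically. In the following Python sketch a free algebra is encoded as a dictionary from constructor names to arities; the dictionary encoding and the concrete constructor names are assumptions made for this illustration.

```python
from itertools import combinations


def is_word_algebra(arities):
    """Exactly one constructor of arity 0, all the others of arity 1."""
    values = list(arities.values())
    return values.count(0) == 1 and all(a in (0, 1) for a in values)


def pairwise_disjoint(family):
    """True if no constructor name is shared between two algebras of the family."""
    return all(not (set(a) & set(b)) for a, b in combinations(family, 2))


# Binary strings: two unary constructors and one constant (the empty string).
BINARY_STRINGS = {"cB1": 1, "cB2": 1, "cB3": 0}
# Binary trees: one binary constructor and one constant (the leaf).
BINARY_TREES = {"cC1": 2, "cC2": 0}

if __name__ == "__main__":
    print(is_word_algebra(BINARY_STRINGS), is_word_algebra(BINARY_TREES))
    print(pairwise_disjoint([BINARY_STRINGS, BINARY_TREES]))
```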

The language $\mathcal{M}_\mathscr{A}$ of HOLRR terms is defined by the following productions:

$$M ::= x \mid c \mid \langle M, M\rangle \mid MM \mid \lambda x.M \mid \mathtt{let}\ \langle x, x\rangle \Leftarrow M\ \mathtt{in}\ M \mid$$
$$\{\!\{M, \ldots, M\}\!\}[x/M, \ldots, x/M]\,M \mid \langle\!\langle M, \ldots, M\rangle\!\rangle[x/M, \ldots, x/M]\,M$$

where $c$ ranges over the constructors for the free algebras in $\mathscr{A}$. An occurrence
of a term $N$ inside another term $M$ has recursion degree $n$ if it is nested into $n$
terms of the form $\langle\!\langle M, \ldots, M\rangle\!\rangle$ inside $M$. When we write a term as M, we are
implicitly assuming it to be closed (i.e. to contain no free variables).

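The language $\mathcal{T}_{\mathscr{A}}$ of HOLRR types is defined by the following productions:

$$A ::= \mathbb{A}^n \mid A \otimes A \mid A \multimap A$$

where $n$ ranges over $\mathbb{N}$ and $\mathbb{A}$ ranges over $\mathscr{A}$. Tensor associates to the left, both
in types and terms (that is, pairs). For $A \in \mathcal{T}_{\mathscr{A}}$, define the lifting $\#(A) \in \mathcal{T}_{\mathscr{A}}$ of $A$:

$$\#(\mathbb{A}^n) = \mathbb{A}^{n+1}$$

$$\#(A \,\square\, B) = \#(A) \,\square\, \#(B) \quad \text{with } \square \in \{\otimes, \multimap\}.$$

The level $\mathbb{L}(A) \in \mathbb{N}$ of a type $A$ is defined by induction on $A$:

$$\mathbb{L}(\mathbb{A}^n) = n$$

$$\mathbb{L}(A \otimes B) = \mathbb{L}(A \multimap B) = \max\{\mathbb{L}(A), \mathbb{L}(B)\}.$$

The index set $\mathbb{I}(A) \subseteq \mathbb{N}$ of $A$ is defined in a similar way:

$$\mathbb{I}(\mathbb{A}^n) = \{n\}$$

$$\mathbb{I}(A \otimes B) = \mathbb{I}(A \multimap B) = \mathbb{I}(A) \cup \mathbb{I}(B).$$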
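The three inductive definitions on types (lifting, level, index set) can be sketched as plain recursive functions. The class names `Base`, `Tensor`, `Arrow` and the function names are assumptions made for this illustration; algebras are identified by a string name.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """A base type: a free algebra (identified by name) at a given level n."""
    algebra: str
    level: int


@dataclass(frozen=True)
class Tensor:
    left: "Type"
    right: "Type"


@dataclass(frozen=True)
class Arrow:
    left: "Type"
    right: "Type"


Type = Union[Base, Tensor, Arrow]


def lift(a):
    """Lifting: raise the level of every base type occurring in a by one."""
    if isinstance(a, Base):
        return Base(a.algebra, a.level + 1)
    return type(a)(lift(a.left), lift(a.right))


def level(a):
    """Level: the maximum level among the base types occurring in a."""
    if isinstance(a, Base):
        return a.level
    return max(level(a.left), level(a.right))


def index_set(a):
    """Index set: all levels of base types occurring in a."""
    if isinstance(a, Base):
        return frozenset({a.level})
    return index_set(a.left) | index_set(a.right)
```

For instance, for a type built from binary strings at levels 0 and 2 and binary trees at level 1, the level is 2, the index set is {0, 1, 2}, and lifting shifts every level by one.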
The rules in Fig. 1 define the assignment of types in $\mathcal{T}_{\mathscr{A}}$ to terms in $\mathcal{M}_{\mathscr{A}}$. A type
derivation $\pi$ with conclusion $\Gamma \vdash M : A$ will be denoted by $\pi : \Gamma \vdash M : A$. If
there is $\pi : \Gamma \vdash M : A$ then we will write $\Gamma \vdash_H M : A$ and mark $M$ as a typeable
HOLRR term. $\mathcal{M}^{H}_{\mathscr{A}}$ is the set of HOLRR typeable terms. A type derivation
$\pi : \Gamma \vdash M : A$ is in standard form if $\Gamma$ does not contain variables introduced by
rule $W$.



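$$\frac{}{x : A \vdash x : A}\;A \qquad \frac{\Gamma \vdash M : B}{\Gamma, x : A \vdash M : B}\;W$$

$$\frac{\Gamma, x : A \vdash M : B}{\Gamma \vdash \lambda x.M : A \multimap B}\;I_\multimap \qquad \frac{\Gamma \vdash M : A \multimap B \quad \Delta \vdash N : A}{\Gamma, \Delta \vdash MN : B}\;E_\multimap$$

$$\frac{\Gamma \vdash M : A \quad \Delta \vdash N : B}{\Gamma, \Delta \vdash \langle M, N\rangle : A \otimes B}\;I_\otimes \qquad \frac{\Gamma \vdash M : A \otimes B \quad \Delta, x : A, y : B \vdash N : C}{\Gamma, \Delta \vdash \mathtt{let}\ \langle x, y\rangle \Leftarrow M\ \mathtt{in}\ N : C}\;E_\otimes$$

$$\frac{n \in \mathbb{N} \quad c \in \mathcal{C}_{\mathbb{A}}}{\vdash c : \underbrace{\mathbb{A}^n \multimap \cdots \multimap \mathbb{A}^n}_{\mathcal{R}_{\mathbb{A}}(c)\text{ times}} \multimap \mathbb{A}^n}\;I_c$$

$$\frac{A = \mathbb{A}^m \quad \Gamma = x_1 : B_1, \ldots, x_n : B_n \quad \Gamma \vdash M_{c_i} : \underbrace{A \multimap \cdots \multimap A}_{\mathcal{R}_{\mathbb{A}}(c_i)\text{ times}} \multimap C \quad \Delta_i \vdash N_i : B_i \quad \Theta \vdash L : A}{\Delta_1, \ldots, \Delta_n, \Theta \vdash \{\!\{M_{c_1}, \ldots, M_{c_k}\}\!\}[x_1/N_1, \ldots, x_n/N_n]\, L : C}\;E^C_\multimap$$

$$\frac{A = \mathbb{A}^m \quad \Gamma = x_1 : B_1, \ldots, x_n : B_n \quad \Gamma \vdash M_{c_i} : \underbrace{A \multimap \cdots \multimap A}_{\mathcal{R}_{\mathbb{A}}(c_i)\text{ times}} \multimap \underbrace{C \multimap \cdots \multimap C}_{\mathcal{R}_{\mathbb{A}}(c_i)\text{ times}} \multimap C \quad \Delta_i \vdash N_i : B_i \quad \Theta \vdash L : A}{\Delta_1, \ldots, \Delta_n, \Theta \vdash \langle\!\langle M_{c_1}, \ldots, M_{c_k}\rangle\!\rangle[x_1/N_1, \ldots, x_n/N_n]\, L : C}\;E^R_\multimap$$

Fig. 1. Type assignment rules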



The recursion depth $\mathbb{R}(\pi)$ of a HOLRR type derivation $\pi : \Gamma \vdash M : A$ is
defined by induction on the structure of $\pi$. In particular:

• If $\pi$ is an instance of rules $A$ or $I_c$, then $\mathbb{R}(\pi) = 0$.

• If the last rule used in $\pi$ is $E^R_\multimap$, then $\pi$ has the following shape

7Ti ... 7T m Oh L : B l A 
A, Oh «M l5 . . . , M n ))[x i/TVr, . . . , x m /N m ] L : C 



and K(7r) is * + max{M(7Ti), . . . ,K(7r m )}. 
• In all the other cases, $\pi$ can be written as follows

$$\frac{\pi_1 \quad \ldots \quad \pi_m}{\Gamma \vdash M : A}$$

We will define $\mathbb{R}(\pi)$ as $\max\{\mathbb{R}(\pi_1), \ldots, \mathbb{R}(\pi_m)\}$.

Proposition 1. If $\Gamma \vdash_H M : A$ and $\Delta, x : A \vdash_H N : B$, then $\Gamma, \Delta \vdash_H N\{M/x\} : B$.

Proof. Induction on the structure of the derivation for $\Delta, x : A \vdash_H N : B$. □

For every term $t$ of a free algebra $\mathbb{A} \in \mathscr{A}$ and for every natural number $n$, there
is an HOLRR type derivation $\pi(t, n) : \vdash t : \mathbb{A}^n$. This allows us to prove:

Proposition 2. If $x_1 : A_1, \ldots, x_n : A_n \vdash_H M : B$, then $x_1 : \#(A_1), \ldots, x_n : \#(A_n) \vdash_H M : \#(B)$.

The reduction rule $\to$ on $\mathcal{M}_{\mathscr{A}}$ is given in Fig. 2. Its contextual closure
is locally confluent and strongly normalizing, a property provable by
embedding the calculus into system T; so, it is Church-Rosser as well.

Redexes in the form $\langle\!\langle M_{c_1}, \ldots, M_{c_k}\rangle\!\rangle[x_1/N_1, \ldots, x_n/N_n]\, t$ are called recursive
redexes; those in the form $\{\!\{M_{c_1}, \ldots, M_{c_k}\}\!\}[x_1/N_1, \ldots, x_n/N_n]\, t$ are conditional
redexes; all the others are called linear redexes.



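$$(\lambda x.M)N \to M\{N/x\}$$

$$\mathtt{let}\ \langle x, y\rangle \Leftarrow \langle M, N\rangle\ \mathtt{in}\ L \to L\{M/x, N/y\}$$

$$\langle\!\langle M_{c_1}, \ldots, M_{c_k}\rangle\!\rangle[x_1/N_1, \ldots, x_n/N_n]\, c_i(t_1, \ldots, t_{\mathcal{R}(c_i)}) \to M_{c_i}\{N_1/x_1, \ldots, N_n/x_n\}\, t_1 \cdots t_{\mathcal{R}(c_i)}\, (\langle\!\langle M_{c_1}, \ldots, M_{c_k}\rangle\!\rangle[x_1/N_1, \ldots, x_n/N_n]\, t_1) \cdots (\langle\!\langle M_{c_1}, \ldots, M_{c_k}\rangle\!\rangle[x_1/N_1, \ldots, x_n/N_n]\, t_{\mathcal{R}(c_i)})$$

$$\{\!\{M_{c_1}, \ldots, M_{c_k}\}\!\}[x_1/N_1, \ldots, x_n/N_n]\, c_i(t_1, \ldots, t_{\mathcal{R}(c_i)}) \to M_{c_i}\{N_1/x_1, \ldots, N_n/x_n\}\, t_1 \cdots t_{\mathcal{R}(c_i)}$$

Fig. 2. Normalization on terms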
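The behaviour of conditional and recursive redexes on free algebra terms can be illustrated by a small evaluator. This is only a sketch under simplifying assumptions: free algebra terms are encoded as pairs `(constructor, subterms)`, the branches are ordinary Python functions rather than typed terms, and the explicit substitution part of the redex is omitted. All names are hypothetical.

```python
def conditional(branches, term):
    """Conditional: select the branch of the head constructor and
    apply it to the immediate subterms."""
    constructor, subterms = term
    return branches[constructor](*subterms)


def recursion(branches, term):
    """Recursion: apply the branch of the head constructor to the immediate
    subterms, followed by the results of the recursive calls on them."""
    constructor, subterms = term
    results = [recursion(branches, s) for s in subterms]
    return branches[constructor](*subterms, *results)


# Binary strings: c1 and c2 are unary, c3 is the nullary constructor.
def c1(t): return ("c1", (t,))
def c2(t): return ("c2", (t,))
c3 = ("c3", ())

# Binary trees: a binary node constructor and a nullary leaf.
def node(left, right): return ("node", (left, right))
leaf = ("leaf", ())

length = {"c1": lambda t, r: r + 1, "c2": lambda t, r: r + 1, "c3": lambda: 0}
leaves = {"node": lambda l, r, rl, rr: rl + rr, "leaf": lambda: 1}
is_empty = {"c1": lambda t: False, "c2": lambda t: False, "c3": lambda: True}
```

With these definitions, `recursion(length, c1(c2(c1(c3))))` counts the three letters of the word, and `recursion(leaves, ...)` counts the leaves of a tree; the same recursion scheme serves both algebras, which mirrors the uniform treatment of free algebras in the calculus.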



The following proposition will be useful in the sequel.

Proposition 3. If $\vdash_H M : \mathbb{A}^n$, then the (unique) normal form of $M$ is a free
algebra term $t$. Moreover, $t$ can be obtained from $M$ by successively firing redexes
with null recursion degree.

Proof. By a standard reducibility argument. □

$\mathcal{M}^{H}_{\mathscr{A}}$ contains terms that cannot be reduced in polynomial time. To enforce
polynomial-time normalization, we introduce the following two conditions on type derivations:

• a type derivation $\pi$ is word-contextual if every occurrence of $B_i$ in every
instance of $E^R_\multimap$ inside $\pi$ has form $\mathbb{W}^n$, $\mathbb{W}$ being a word algebra.

• a type derivation $\pi$ is ramified if every instance of $E^R_\multimap$ inside $\pi$ satisfies
$\mathbb{L}(A) > \mathbb{L}(C)$.

In the following sections, we will show that these conditions are both crucial to
reach polytime soundness. If $\pi : \Gamma \vdash M : A$, where $\pi$ is word-contextual and
ramified, $M$ is said to be word-ramified and we will write $\pi : \Gamma \vdash_{WR} M : A$.
The class of all word-ramified HOLRR terms will be denoted as $\mathcal{M}^{WR}_{\mathscr{A}}$.

3 Polytime Soundness 

The goal is to prove polytime soundness for HOLRR in the form of

Theorem 1. There is a sound and complete normalization strategy such that
the time required to normalize a term $M$ is $O(|M|^h)$ where $h$ only depends on
$\mathbb{R}(\pi)$, $\pi : \Gamma \vdash M : A$ being word-contextual and ramified.

The reduction strategy we use proceeds by firing the rightmost innermost redex
among those with minimum recursion degree, where the firing of a recursive
redex corresponds to a complete unfolding, counted as a single step. We call it the
rightmost innermost minimum recursion degree strategy, and $M \mapsto N$ denotes
that $M$ rewrites to $N$ by one of its possible steps.

We will prove Theorem 1 by studying normalization by way of interaction
graphs, which are graphs corresponding to HOLRR type derivations. Notice
that we will not use interaction graphs as a virtual machine computing normal
forms: they are merely a tool facilitating the study of HOLRR dynamics.
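Let $\mathcal{L}_{\mathscr{A}}$ be the set

$$\{W, I_\multimap, E_\multimap, I_\otimes, E_\otimes, P, C\} \cup \bigcup_{\mathbb{A} \in \mathscr{A}} \bigcup_{c \in \mathcal{C}_{\mathbb{A}}} \{I_c\} \cup \{E^C_\multimap, E^R_\multimap, P^C, P^R\}.$$

Elements of $\mathcal{L}_{\mathscr{A}}$ either are typing rule names or lie in $\{P, C, P^C, P^R\}$: they are
premises ($P$), conclusions ($C$) or limit conditionals ($P^C$) and recursions ($P^R$).
An interaction graph is a quadruple $(V, E, \alpha, \beta)$ such that

• $(V, E)$ is a directed graph;

• $\alpha : V \to \mathcal{L}_{\mathscr{A}}$;

• $\beta : E \to \mathcal{T}_{\mathscr{A}}$.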

$\mathcal{G}_{\mathscr{A}}$ is the set of all interaction graphs. We will now introduce a class $\mathcal{G}^{H}_{\mathscr{A}}$ of
interaction graphs corresponding to HOLRR type derivations. $\mathcal{G}^{H}_{\mathscr{A}}$ is defined
inductively, mimicking the process of type derivation building. First, the
interaction graphs in Fig. 3(a) lie in $\mathcal{G}^{H}_{\mathscr{A}}$. Moreover, suppose $G_0, \ldots, G_{k(\mathbb{A})+n} \in \mathcal{G}^{H}_{\mathscr{A}}$
and they have form as in Fig. 3(b); then all the interaction graphs depicted in
Fig. 4 lie in $\mathcal{G}^{H}_{\mathscr{A}}$, provided the constraints listed next to each graph are satisfied.

To every HOLRR type derivation $\pi : \Gamma \vdash M : A$ corresponds an interaction
graph $\mathcal{G}(\pi) \in \mathcal{G}^{H}_{\mathscr{A}}$. Moreover, every instance of rules $I_\multimap, E_\multimap, I_\otimes, E_\otimes, I_c, E^C_\multimap, E^R_\multimap$
in $\pi$ corresponds to a vertex in $\mathcal{G}(\pi)$ having the same label. In particular, if $v$
corresponds to an instance $\vdash c : \mathbb{A}^n \multimap \cdots \multimap \mathbb{A}^n$ of rule $I_c$, then $\delta(v)$ is the
integer $n$, and, if this occurrence of $c$ has recursion degree $m$, then $\gamma(v)$ is $m$.

Fig. 3. Some interaction graphs, panels (a) and (b).

Lemma 1. There are two constants $n, m \in \mathbb{Q}$ such that, for every $\pi : \Gamma \vdash M : A$
in standard form, we have $n|M| \leq |V| \leq m|M|$, where $\mathcal{G}(\pi) = (V, E, \alpha, \beta)$.

Inside a given graph, we call traps those subgraphs corresponding to normal form
derivations $\pi(t, n) : \vdash t : \mathbb{W}^n$ (where $\mathbb{W}$ is a word algebra) and ending on the last
$E_\multimap$ (for instance, Fig. 5 shows the trap corresponding to $\pi(c_1^{\mathbb{B}} c_2^{\mathbb{B}} c_1^{\mathbb{B}} c_3^{\mathbb{B}}, 0)$). We are
here interested in certain paths inside interaction graphs: given an interaction
graph $G = (V, E, \alpha, \beta)$, an $n$-typed path of $G$ is a sequence $v = v_1, \ldots, v_m \in V^+$
such that the two following conditions hold:

• for every $i \in \{1, \ldots, m-1\}$, either $(v_i, v_{i+1}) \in E$ and $n \in \mathbb{I}(\beta(v_i, v_{i+1}))$ or
$(v_{i+1}, v_i) \in E$ and $n \in \mathbb{I}(\beta(v_{i+1}, v_i))$, and

• for every $i \in \{1, \ldots, m-1\}$, if $v_i$ is part of a trap, then $v_{i+1}, \ldots, v_m$ must
all be part of the same trap.

Intuitively, when a typed path enters a trap, it cannot exit it.
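The two conditions defining an $n$-typed path can be checked mechanically. The following sketch makes several simplifying assumptions: edges are given as a dictionary from ordered vertex pairs to the index set of the edge's type (rather than to the type itself), and traps are given as sets of vertices. The function name and the data layout are hypothetical.

```python
def is_typed_path(n, path, edge_index, traps=()):
    """Check whether `path` (a non-empty list of vertices) is an n-typed path.

    edge_index maps a directed edge (u, v) to the index set of its type;
    traps is a collection of vertex sets, one per trap.
    """
    if not path:
        return False
    # Condition 1: consecutive vertices are linked by an edge, in either
    # direction, whose type has n in its index set.
    for a, b in zip(path, path[1:]):
        forward = n in edge_index.get((a, b), ())
        backward = n in edge_index.get((b, a), ())
        if not (forward or backward):
            return False
    # Condition 2: once the path enters a trap, it stays inside that trap.
    for i, v in enumerate(path[:-1]):
        for trap in traps:
            if v in trap and not all(w in trap for w in path[i + 1:]):
                return False
    return True


EDGES = {("a", "b"): {0}, ("c", "b"): {0, 1}, ("c", "d"): {1}}
```

For example, with the edges above the sequence a, b, c is a 0-typed path (the second step traverses the edge from c to b backwards), whereas a, b, c, d is not, since the last edge only carries index 1.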

Suppose $v$ is a vertex of $G$, $\varphi$ is a positive integer and $\psi$ is a nonnegative
integer. Then the weight $W_{\varphi,\psi}(v)$ of $v$ is defined by cases:

( 1 if a(v) = I c A S(v) > if 

W M {v) = l 0 if a(v)^I c 

^ (j) jj 0 ( 77 ) — I c A S(v) < ip 

The weight $W_{\varphi,\psi}(v)$ of an $n$-typed path $v = v_1, \ldots, v_m$ is $\sum_{u \in \{v_1, \ldots, v_m\}} W_{\varphi,\psi}(u)$,
where every vertex counts once even if it occurs many times in $v$. The $n$-weight
$W^n_{\varphi,\psi}(G)$ of an interaction graph $G$ is just the maximum among $W_{\varphi,\psi}(v)$ over
all $n$-typed paths $v$ inside $G$. The weight of an interaction graph is parametric
on $\varphi$ and $\psi$. The following result, however, holds for every $\varphi$ and $\psi$.

Lemma 2. Let $\pi_M : \Gamma \vdash M : A$ and suppose $\pi_M$ contains a subderivation in
the form $\pi(t, n)$. Then $W^n_{\varphi,\psi}(\mathcal{G}(\pi_M)) \geq |t|$.

Remark 1. Some basic observations are worth making to understand the proof of
proposition 4 below. The goal is to understand how $W^n_{\varphi,\psi}(\mathcal{G}(\pi_M))$ and
$W^n_{\varphi,\psi}(\mathcal{G}(\pi_N))$ relate to each other, when $\pi_M : \Gamma \vdash_{WR} M : A$, and
$\pi_N : \Gamma \vdash_{WR} N : A$, and $M$ rewrites to $N$.

Fig. 4. Inductive cases

Rewriting on terms, as in Fig. 2, is matched by certain transformations on the
corresponding graphs, described in Fig. 6. The graph transformations account for
the modifications on the graphs, except for the erasure of sub-terms, which
is the computational effect of weakening. When describing the modifications on
the graphs induced by the firing of a redex, it is always understood that after any
transformation as in Fig. 6, one should also perform all those transformations
which correspond to the deletion of a sub-term as caused by a substitution for
a weakened variable. These transformations can always be written as the one in
Fig. 7, where $G$ only depends on the term being deleted. We remark once again
that we use graphs as a mere tool for the study of the complexity of reduction,
and not as a kind of computational device implementing reduction.

Fig. 5. The trap corresponding to $\pi(c_1^{\mathbb{B}} c_2^{\mathbb{B}} c_1^{\mathbb{B}} c_3^{\mathbb{B}}, 0)$.

As a first case, assume $M$ yields $N$ by firing a linear redex. Then, $\mathcal{G}(\pi_M)$
transforms to $\mathcal{G}(\pi_N)$ by one of the rules in Fig. 6(a) or Fig. 6(b). Then,
$\mathcal{G}(\pi_N)$ has fewer vertices than $\mathcal{G}(\pi_M)$ and for every $n$-typed path $v$ in $\mathcal{G}(\pi_N)$, there
is a corresponding $n$-typed path $w$ in $\mathcal{G}(\pi_M)$, with $W_{\varphi,\psi}(v) \leq W_{\varphi,\psi}(w)$.

Assume, instead, that we fire a conditional redex, namely
$\{\!\{M_{c_1}, \ldots, M_{c_k}\}\!\}[x_1/N_1, \ldots, x_m/N_m]\, t$. At the graph level we need to
focus on the transformation in Fig. 6(c), where $K$ will contain at most $k$ nodes,
all labeled with $E_\multimap$, while $t_1, t_2, \ldots$ all are sub-terms of $t$. Again, $\mathcal{G}(\pi_N)$ has
fewer vertices than $\mathcal{G}(\pi_M)$ and for every $n$-typed path $v$ in $\mathcal{G}(\pi_N)$, there is a
corresponding $n$-typed path $w$ in $\mathcal{G}(\pi_M)$, with $W_{\varphi,\psi}(v) \leq W_{\varphi,\psi}(w)$.

Finally, assume that we fire a recursive redex. By proposition 3, it must be in the
form $\langle\!\langle M_{c_1}, \ldots, M_{c_k}\rangle\!\rangle[x_1/s_1, \ldots, x_m/s_m]\, t$, where $t, s_1, \ldots, s_m$ are free-algebra
terms. At the graph level, the transformation behaves as in Fig. 6(d), where:

• For every i £ {1, . . . , /}, there is a constructor c £ Ca such that c(t\ . . .t\.) 
is a sub-term of t; 

• K, K\ , , Ki contain nodes v such that a(v) = E^,; 

• |t| bounds both l and the number of vertices in K\ 

• For every i € { 1, . . . , Z}, the number of vertices in Ki is bounded by kf, 

• Ei, ... ,Ei all are in the form D . . . —o D. 

If $p > n$, then $W^p_{\varphi,\psi}(\mathcal{G}(\pi_N)) \leq W^p_{\varphi,\psi}(\mathcal{G}(\pi_M))$: for every $p$-typed path $v$
inside $\mathcal{G}(\pi_N)$, there is a corresponding $p$-typed path $w$ inside $\mathcal{G}(\pi_M)$, where
$W_{\varphi,\psi}(v) \leq W_{\varphi,\psi}(w)$. Assume now that $p \leq n$. Certainly, any $p$-typed path $v$
inside $\mathcal{G}(\pi_N)$ can be mimicked by a $p$-typed path $w$ inside $\mathcal{G}(\pi_M)$ in such a way
that a constructor vertex in $w$ corresponds to every constructor vertex appearing
in $v$. This correspondence, however, is not injective. Whenever $u$ is a vertex
appearing in $w$ and belonging to $\mathcal{G}(M_{c_i})$, $v$ can contain distinct $u_1, \ldots, u_j$ (where
$j \leq l$), all of them being “copies” of $u$. On the other hand, all the equations
$\gamma(u_1) = \ldots = \gamma(u_j) = \gamma(u) - 1$ hold. Notice that, by our definition of a $p$-typed
path, if $u$ belongs to the trap $\mathcal{G}(s_i)$, then $v$ can only contain one copy of $u$.

The remarks here above lead to the following, crucial, result: 

Proposition 4. There is a function $f : \mathbb{N} \to \mathbb{N}$ such that, for every
word-contextual and ramified $\pi_M : \Gamma \vdash M : A$, if $M \mapsto^* N$ and $t$ is a free algebra
term appearing in $N$, then $|t| = O(|M|^{f(\mathbb{R}(\pi_M))})$.
Fig. 6. The graph transformations produced by the firing of a redex.



Proof. Let $\mathcal{G}(\pi_M) = (V, E, \alpha, \beta)$. We will prove that, for every $n, m \in \mathbb{N}$, if
$M \mapsto^n N$, then

$$W^m_{|V|,\mathbb{R}(\pi_M)}(\mathcal{G}(\pi_N)) \leq W^m_{|V|,\mathbb{R}(\pi_M)}(\mathcal{G}(\pi_M)) \qquad (1)$$

$$\leq \begin{cases} |V| & \text{if } \mathbb{R}(\pi_M) < m \\ |V|(\mathbb{R}(\pi_M) + 1)^{\mathbb{R}(\pi_M) - m + 1} & \text{otherwise} \end{cases} \qquad (2)$$

Fig. 7. The graph transformation induced by the substitution for a weakened variable.

First of all, let us consider the case $n = 0$. $N$ equals $M$, so (1) is trivially verified.
Suppose $v = v_1, \ldots, v_p$ is an $m$-typed path inside $\mathcal{G}(\pi_M)$. If $m > \mathbb{R}(\pi_M)$ then, by
definition, $W_{|V|,\mathbb{R}(\pi_M)}(v) \leq |V|$. If $m \leq \mathbb{R}(\pi_M)$, then

$$W_{|V|,\mathbb{R}(\pi_M)}(v) \leq |V|\,\mathbb{R}(\pi_M)(\mathbb{R}(\pi_M) + 1)^{\mathbb{R}(\pi_M) - m} \leq |V|(\mathbb{R}(\pi_M) + 1)(\mathbb{R}(\pi_M) + 1)^{\mathbb{R}(\pi_M) - m} = |V|(\mathbb{R}(\pi_M) + 1)^{\mathbb{R}(\pi_M) - m + 1}$$

As a consequence, (2) holds.

Suppose now that $n > 0$ and that the thesis holds for $n - 1$. By remark 1
and by the induction hypothesis, we only have to show that (1) is preserved
by recursion unfolding: in the other cases, path weights cannot increase. If
$m > \mathbb{R}(\pi_M)$, then even recursion unfolding does not increase the weight of $m$-typed
paths. If $m \leq \mathbb{R}(\pi_M)$, let $w$ be an $m$-typed path in $\mathcal{G}(\pi_N)$ and let $v$ be the
$m$-typed path in $\mathcal{G}(\pi_M)$ that corresponds to $w$. As discussed previously, for every
vertex $u$ appearing in $v$, $w$ may contain several distinct vertices $u_1, \ldots, u_j$, all
corresponding to u. By lemma 2, j < and by the induction 

hypothesis 



E w ]v i,R (7rM) K-) = e ®r (Ui)(R(7rM)+1)B< ’ M> '" 



i= 1 



i = 1 
3 



( 7 (u)-l)(R(7r M )+l) E(7rM)_m 

( 7 (u)- 1)(R(7 r M )+l) E(7rM) - mN 



= £|ci 

i= 1 

< ( n/i(R(7TM)+i) R(,rM) - m 



This, by lemma 1, concludes the proof. 



□ 



Proposition 5. There is a function $g : \mathbb{N} \to \mathbb{N}$ such that, if $\pi_M : \Gamma \vdash M : A$
is word-contextual and ramified, the number of recursive redexes fired during
normalization is $O(|M|^{g(\mathbb{R}(\pi_M))})$.

Proof. A recursive redex with recursion degree $m$ can be copied $O(|M|^{m f(\mathbb{R}(\pi_M))})$
times during normalization, as can be proved from proposition 4 by induction
on $m$. Now, notice that $m \leq \mathbb{R}(\pi_M)$. As a consequence, the function $g(n) =
n f(n) + 1$ is a suitable bound. □

Summing up, proposition 4 gives a bound on the size of free algebra terms
appearing inside reducts of a given (word-ramified) term $M$. This is proved by
showing that (for every $n \in \mathbb{N}$) the $n$-weight of the underlying interaction graph
does not increase during normalization. This result, by itself, does not prove
anything about the complexity of normalization. Proposition 5, however, exploits
it by bounding the total number of recursive redexes fired during normalization
of $M$. So, the proof of Theorem 1 can follow. From proposition 5, the number
of recursive redexes the normalization fires is $O(|M|^{g(\mathbb{R}(\pi_M))})$, where $g : \mathbb{N} \to \mathbb{N}$
does not depend on $|M|$. By proposition 4, the time to unfold a recursive redex is
itself $O(|M|^{f(\mathbb{R}(\pi_M))})$, where $f : \mathbb{N} \to \mathbb{N}$ does not depend on $|M|$. Finally, notice
that, by firing a linear or conditional redex, the underlying interaction graph
shrinks. This concludes the proof.

4 Polytime Completeness 

This property holds by representing predicative sorting into HOLRR. Predicative
sorting, introduced below, reformulates ramified recurrence (or predicative
recursion) on words [11]; given a word algebra $\mathbb{W}$, predicative recursion is a
function algebra generating all, and only, the polynomial-time functions in the form
$f : \mathbb{W}^n \to \mathbb{W}$. Predicative sorting on $\mathbb{W}$ follows:

1. The function $f_{c^{\mathbb{W}}_{k(\mathbb{W})}} : \mathbb{W}^0 \to \mathbb{W}$ that returns $c^{\mathbb{W}}_{k(\mathbb{W})}$ can be predicatively sorted
by $\varepsilon \to n$, for every $n \in \mathbb{N}$, $\varepsilon$ being the empty sequence;

2. For every $i \in \{1, \ldots, k(\mathbb{W}) - 1\}$, the function $f_{c^{\mathbb{W}}_i} : \mathbb{W} \to \mathbb{W}$ defined by
$f_{c^{\mathbb{W}}_i}(t) = c^{\mathbb{W}}_i t$ can be predicatively sorted by $(n) \to n$ for every $n \in \mathbb{N}$;

3. For every $n \in \mathbb{N}$ and $1 \leq i \leq n$, the projection $\pi^n_i : \mathbb{W}^n \to \mathbb{W}$ can be
predicatively sorted by $(m_1, \ldots, m_n) \to m$ for every $m, m_1, \ldots, m_n \in \mathbb{N}$,
with $m_i = m$;

4. If $f : \mathbb{W}^n \to \mathbb{W}$ can be predicatively sorted by $(m_1, \ldots, m_n) \to m$ and
$g_1, \ldots, g_n : \mathbb{W}^p \to \mathbb{W}$ are such that $g_i$ can be predicatively sorted by
$(r_1, \ldots, r_p) \to m_i$, then the function $h : \mathbb{W}^p \to \mathbb{W}$ defined by the equation

$$h(t_1, \ldots, t_p) = f(g_1(t_1, \ldots, t_p), \ldots, g_n(t_1, \ldots, t_p))$$

can be predicatively sorted by $(r_1, \ldots, r_p) \to m$;

5. Suppose for every $i \in \{1, \ldots, k(\mathbb{W}) - 1\}$ there is a function $f_i : \mathbb{W}^{n+1} \to \mathbb{W}$
that can be predicatively sorted by $(l, m_1, \ldots, m_n) \to m$ and that $f_{k(\mathbb{W})} :
\mathbb{W}^n \to \mathbb{W}$ can be predicatively sorted by $(m_1, \ldots, m_n) \to m$. Then the
function $h : \mathbb{W}^{1+n} \to \mathbb{W}$ defined by

$$h(c^{\mathbb{W}}_i t, t_1, \ldots, t_n) = f_i(t, t_1, \ldots, t_n)$$

$$h(c^{\mathbb{W}}_{k(\mathbb{W})}, t_1, \ldots, t_n) = f_{k(\mathbb{W})}(t_1, \ldots, t_n)$$

can be predicatively sorted by $(l, m_1, \ldots, m_n) \to m$.

6. Suppose for every $i \in \{1, \ldots, k(\mathbb{W}) - 1\}$ there is a function $f_i : \mathbb{W}^{n+2} \to \mathbb{W}$
that can be predicatively sorted by $(l, m_1, \ldots, m_n, m) \to m$ and that $f_{k(\mathbb{W})} :
\mathbb{W}^n \to \mathbb{W}$ can be predicatively sorted by $(m_1, \ldots, m_n) \to m$. Then a function
$h : \mathbb{W}^{1+n} \to \mathbb{W}$ can be defined recursively:

$$h(c^{\mathbb{W}}_i t, t_1, \ldots, t_n) = f_i(t, t_1, \ldots, t_n, h(t, t_1, \ldots, t_n))$$

$$h(c^{\mathbb{W}}_{k(\mathbb{W})}, t_1, \ldots, t_n) = f_{k(\mathbb{W})}(t_1, \ldots, t_n).$$

If $l > m$, then $h$ can be predicatively sorted by $(l, m_1, \ldots, m_n) \to m$.

By the definition here above, if $f : \mathbb{W}^n \to \mathbb{W}$ can be predicatively sorted, then
it is definable by predicative recursion.
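The definition-by-cases scheme (rule 5) and the recurrence scheme (rule 6) can be sketched as higher-order functions. In this illustration, which is an assumption of ours and not the paper's encoding, words over a word algebra are represented as Python strings, the empty string plays the role of the nullary constructor, and tiers are not tracked at all; only the equational shape of the two schemes is shown. The names `by_cases`, `by_recurrence`, `concat` and `tail` are hypothetical.

```python
def by_cases(step, base):
    """Rule 5: h(c_i t, ts) = f_i(t, ts) and h(empty, ts) = f_k(ts)."""
    def h(w, *ts):
        if w == "":
            return base(*ts)
        return step[w[0]](w[1:], *ts)
    return h


def by_recurrence(step, base):
    """Rule 6: h(c_i t, ts) = f_i(t, ts, h(t, ts)) and h(empty, ts) = f_k(ts)."""
    def h(w, *ts):
        if w == "":
            return base(*ts)
        rest = w[1:]
        return step[w[0]](rest, *ts, h(rest, *ts))
    return h


# Concatenation by recurrence on the first argument: each step function
# only prepends a letter to the recursive result.
concat = by_recurrence(
    {a: (lambda rest, v, r, a=a: a + r) for a in "01"},
    lambda v: v,
)

# Removing the first letter, by case distinction.
tail = by_cases({a: (lambda rest: rest) for a in "01"}, lambda: "")
```

In `concat`, the step functions use the recursive result only as an argument of a constructor, which is the kind of use that rule 6 permits when the recurrence argument sits at a strictly higher tier than the result.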

Remark 2. If $f$ can be predicatively sorted by $(m_1, \ldots, m_n) \to m$ and $m_i <
m$, then $f$ is independent from its $i$-th argument (see [8]). We will suppose
that, in rule 3, $m_1, \ldots, m_n \geq m$. This ensures that, if $f$ can be predicatively
sorted by $(m_1, \ldots, m_n) \to m$, then $m_1, \ldots, m_n \geq m$, simplifying the proof of
completeness, without loss of generality.

Theorem 2 (Completeness). Assume $f : \mathbb{W}^n \to \mathbb{W}$ can be predicatively sorted by
$(m_1, \ldots, m_n) \to m$. There is a closed term $M_f$ that represents $f$, whose type can
be $\mathbb{W}^{l_1} \otimes \cdots \otimes \mathbb{W}^{l_n} \multimap \mathbb{W}^l$, where $l, l_1, \ldots, l_n \in \mathbb{N}$ and, for every $i \in \{1, \ldots, n\}$,
either $m_i = m$ and $l_i = l$, or $m_i > m$ and $l_i > l$.

The proof uses the definition of the terms: 



Coerc : —o B$ 

Duplicate : B^, —o B$ 0 B ^ 

V(M) : B^ 0 • • • ® B^ 0 B% — B% 



such that: 



Coerc(f) t 
Duplicate(f) * (t, t) 

V(M)(t i, ... ,t p ,t) * M(ti, . . . ,t p ,t,t) 



where t : B^, tj : Bfy (j £ {1, . . . ,p}), M : B^ 0 ■ ■ ■ 0 B ^ 0 B^ 0 B ^ ° Byy, 

n € N and m, l < n. In particular: 

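• Coerc is $\lambda x.\langle\!\langle \lambda y.c^{\mathbb{W}}_1 y, \ldots, \lambda y.c^{\mathbb{W}}_{k(\mathbb{W})-1} y, c^{\mathbb{W}}_{k(\mathbb{W})}\rangle\!\rangle\, x$;

• Duplicate is $\lambda x.\langle\!\langle M_1, \ldots, M_{k(\mathbb{W})}\rangle\!\rangle\, x$, where, for every $i \in \{1, \ldots, k(\mathbb{W}) - 1\}$,
$M_i$ is $\lambda y.\ \mathtt{let}\ \langle z, w\rangle \Leftarrow y\ \mathtt{in}\ \langle c^{\mathbb{W}}_i z, c^{\mathbb{W}}_i w\rangle$ and $M_{k(\mathbb{W})}$ is $\langle c^{\mathbb{W}}_{k(\mathbb{W})}, c^{\mathbb{W}}_{k(\mathbb{W})}\rangle$;

• V(M) is

$$\lambda y.\ \mathtt{let}\ \langle y_1, \ldots, y_n, w\rangle \Leftarrow y\ \mathtt{in}\ \langle\!\langle L, \ldots, L, P\rangle\!\rangle[x_1/y_1, \ldots, x_n/y_n, z/w]\ c^{\mathbb{W}}_1 c^{\mathbb{W}}_{k(\mathbb{W})}$$

where $L = \lambda x.\lambda y.M\langle x_1, \ldots, x_n, z, y\rangle$ and $P = z$.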
5 Comparison with Previous Work

There are a number of type systems with the same goal as HOLRR [4, 5, 7, 9]. The
most similar is certainly LT, introduced in [4] and later refined in [5]. HOLRR
and LT are designed from different starting points. LT is basically a restriction
of Gödel's system T, extending the ideas of safe recursion [3] to higher order.
HOLRR, on the other hand, is obtained by endowing linear affine lambda 
calculus with constants, conditionals and recursions, somehow being inspired by 
Leivant’s ramified recurrence on words. 

Linearity is a key ingredient to control the complexity of normalization, in 
presence of higher-order recursion. The terms of LT are not strictly linear: free 
variables of ground types can appear more than once. On the other side, any 
variable of HOLRR occurs at most once in a typeable term, and recursion 
preserves this constraint. The strict linearity of HOLRR fits precisely with the 
introduction of linear arrows, when discharging an assumption. 

Ramification and safety are other tools to get rid of exponential growth. LT 
models safety by distinguishing among complete and incomplete variables and 
by using two families of arrows and products, with careful constraints on their 
interplay. HOLRR ramification has the same flavor as in the original work on 
ramified recurrence on words [11,12], without any major change. 

HOLRR accommodates generic free algebras in a uniform way, with just
one recursion scheme. LT is only about word algebras: the introduction of tree
algebras would require extending the linear discipline to ground variables [9].

In both cases, the system is polytime complete. However, polytime soundness
is formulated and proved in two different ways. In LT, any term $M$ with free
variables $x_1, \ldots, x_n$ is equipped with a polynomial $P_M(y_1, \ldots, y_n)$ in such a way
that the time to normalize $M\{N_1/y_1, \ldots, N_n/y_n\}$ is $O(P_M(|N_1|, \ldots, |N_n|))$; this
result, however, relies on a number of assumptions: all the terms involved must
have linear type, $N_1, \ldots, N_n$ must all be closed and cannot contain complete
free variables of higher type, all free variables of $M\{N_1/y_1, \ldots, N_n/y_n\}$ have to
be linear and incomplete. This means that there is no evident relation between
the structure of $M$ and the degree of $P_M$. On the contrary, the time needed
to compute the normal form of every word-contextual term $M$ of HOLRR is
$O(|M|^h)$, $h$ only depending on the recursion depth of a type derivation for $M$.
In particular, the recursion depth of any type derivation for any term of any free
algebra is null. We claim that our soundness theorem is deeper and more general
than the one on LT.



6 Relaxing Conditions on Type Derivations 



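If we drop word-contextuality and ramification, we immediately get outside
PTIME. For example, if we allow $\mathbb{L}(A)$ to be equal to $\mathbb{L}(C)$ in rule $E^R_\multimap$, we can
build a term $M$ with $\vdash M : \mathbb{B}^0 \multimap \mathbb{B}^0$ such that

$$M\, (c^{\mathbb{B}}_{i_1} c^{\mathbb{B}}_{i_2} \cdots c^{\mathbb{B}}_3) \to^* c^{\mathbb{B}}_{i_1} c^{\mathbb{B}}_{i_1} c^{\mathbb{B}}_{i_2} c^{\mathbb{B}}_{i_2} \cdots c^{\mathbb{B}}_3.$$

Iterating $M$, we easily obtain an exponential behavior. Assume now that, in
rule $E^R_\multimap$, $B_1, \ldots, B_n$ are arbitrary types, namely that type derivations are not
word-contextual. The term $\lambda x.\lambda y.\lambda z.x(yz)$ encoding function composition can
be given type $(\mathbb{B}^0 \multimap \mathbb{B}^0) \multimap (\mathbb{B}^0 \multimap \mathbb{B}^0) \multimap (\mathbb{B}^0 \multimap \mathbb{B}^0)$; using the obvious
generalization of V, we obtain a term $N$ with type $(\mathbb{B}^0 \multimap \mathbb{B}^0) \multimap (\mathbb{B}^0 \multimap \mathbb{B}^0)$
encoding self application. Again, iterating $N$ yields an exponential blow up. The
same problem occurs by starting from $c^{\mathbb{C}}_1$ with type $\mathbb{C}^0 \multimap \mathbb{C}^0 \multimap \mathbb{C}^0$.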
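The exponential behaviour obtained by iteration can be observed on a toy model. The sketch below assumes, purely for illustration, that the non-ramified term behaves like a function duplicating every letter of a binary string; the function names are hypothetical and strings stand in for free algebra terms.

```python
def double_letters(word):
    """Duplicate every letter of the word, so the length doubles."""
    return "".join(letter * 2 for letter in word)


def iterate(f, n, x):
    """Apply f to x exactly n times."""
    for _ in range(n):
        x = f(x)
    return x


# Output length after n iterations on a two-letter input: 2 * 2**n.
lengths = [len(iterate(double_letters, n, "01")) for n in range(6)]
```

Each application is harmless on its own, but the output of one application is the recurrence argument of the next, so the length grows as a power of two in the number of iterations. Ramification rules this out by forcing the result of a recursion to live at a strictly lower level than its argument.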

7 Conclusions 

We provide a higher-order system that embeds, quite naturally, Leivant’s rami- 
fied recurrence on words. 

A final remark about soundness follows. If $\pi_M : \Gamma \vdash M : A$ is a
word-contextual ramified derivation, we obtain a bound $O(|M|^{f(\mathbb{R}(\pi_M))})$ for some
suitable $f$. Now, the exponent does depend on $M$. But suppose $\pi_M : \Gamma \vdash
M : A \multimap B$ and $\pi_N : \Delta \vdash N : A$ to be word-contextual and ramified. The
type derivation $\pi_{MN} : \Gamma, \Delta \vdash MN : B$ is word-contextual and ramified, and
$\mathbb{R}(\pi_{MN}) = \max\{\mathbb{R}(\pi_M), \mathbb{R}(\pi_N)\}$. Taking $M$ as a program, the time to compute
$M$ on argument $N$ is $O(|MN|^{f(\mathbb{R}(\pi_{MN}))})$, a polynomial in $|N|$ whenever inputs
to $M$ have bounded recursion depth. This includes all the cases where inputs are
closed normal forms of a base type. Future work addresses the characterization
of higher-order types whose normal forms all have the same recursion depth.
Abstract. In a previous work we introduced the generalised multiary
$\lambda$-calculus $\lambda J^m$, an extension of the $\lambda$-calculus where functions can be
applied to lists of arguments (a feature which we call “multiarity”) and
encompassing “generalised” eliminations of von Plato. In this paper we
prove confluence and strong normalisation of the reduction relations of
$\lambda J^m$. Proofs of these results lift corresponding ones obtained by Joachimski
and Matthes for the system $\Lambda J$. Such lifting requires the study of how
multiarity and some forms of generality can express each other. This
study identifies a variant of $\Lambda J$, and another system isomorphic to it, as
being the subsystems of $\lambda J^m$ with, respectively, minimal and maximal
use of multiarity. We argue then that $\lambda J^m$ is the system with the right
use of multiarity.



1 Introduction 

In [2] we defined the generalised multiary $\lambda$-calculus $\lambda J^m$, an extension of the
$\lambda$-calculus where application is generalised in two directions: (i) “generality”, in
the sense of von Plato’s generalised eliminations [7]; and (ii) “multiarity”, i.e.
the ability of applying functions to lists of arguments. The original motivation
was to extend Schwichtenberg’s work on permutative conversions for intuitionistic
cut-free sequent calculus [6]. $\lambda J^m$ comes equipped with a set of permutative
conversions for which the permutability theorem holds: two $\lambda J^m$-terms determine
the same $\lambda$-term iff they are inter-permutable. We established confluence
and strong normalisation of these conversions.

In this paper we study confluence and strong normalisation for the reduction
rules of $\lambda J^m$. Our strategy is to use corresponding properties of the system $\Lambda J$
of Joachimski and Matthes [4,5] (the type-theoretic counterpart to von Plato’s
natural deduction system with generalised eliminations). This is a natural
approach because $\Lambda J$ may be seen as a notational variant of a subsystem of $\lambda J^m$
called $\lambda J$.

We lift the results of $\Lambda J$ to $\lambda J^m$ via a mapping $\nu$ whose idea is to express
multiarity by means of generality. To fully achieve this we also need another
mapping $\mu$, which expresses certain uses of generality by multiarity and which
calculates the normal forms for the reduction rule of $\lambda J^m$ with the same name. It
follows that $\mu$ and $\nu$ are inverse bijections between $\mu$-normal forms and terms of
$\lambda J$. We develop this idea and investigate how these mappings preserve reduction.
It turns out that a slight variant of $\Lambda J$ is isomorphic to the subsystem of $\lambda J^m$
determined by the $\mu$-normal forms.

sidade do Minho, and also by the thematic network APPSEM II; the second author
was also supported by the thematic network TYPES.

This emphasis on how multiarity and generality may express each other
contrasts with that in [2], where multiarity and generality are studied as independent
features of $\lambda J^m$.

This paper is organised as follows: Section 2 reviews $\lambda J^m$ and its subsystem
$\lambda J$; Section 3 studies mappings $\mu$ and $\nu$ and establishes the above mentioned
isomorphism; Section 4 proves various results of confluence and strong
normalisation; Section 5 concludes.

Notations: Let $R$ be a binary relation over an inductively defined set of
expressions. $\to_R$ denotes the compatible closure of $R$. $\to_R^+$ and $\to_R^*$ denote respectively
the transitive, and the reflexive and transitive, closure of $\to_R$. Given relations
$R$ and $S$, we write $R, S$ and $RS$ for $R \cup S$ and $S \circ R$, respectively, whenever
convenient.



2 $\lambda J^m$: The Generalised Multiary $\lambda$-Calculus

2.1 Expressions and Typing Rules

Let $V$ denote a denumerable set of variables and $x, y, w, z$ range over it. In the
generalised multiary $\lambda$-calculus $\lambda J^m$ there are two kinds of expressions: terms
and lists.

Definition 1. Terms and lists of $\lambda J^m$ are described in the following grammar:

(terms of $\lambda J^m$) $t, u, v ::= x \mid \lambda x.t \mid t(u, l, (x)v)$

(lists of $\lambda J^m$) $l ::= t :: l \mid []$

The sets of $\lambda J^m$-terms and $\lambda J^m$-lists are denoted by $\Lambda J^m$ and $\mathcal{L}J^m$ respectively.
A term construction of the form $t(u, l, (x)v)$ is called a generalised multiary
application (gm-application for short) and $t$ is called its head. In terms $\lambda x.v$
and $t(u, l, (x)v)$, occurrences of $x$ in $v$ are bound. The list $[]$ is called the empty
list and lists of the form $t :: l$ are called cons-lists. The notation $[u_1, \ldots, u_n]$
abbreviates $u_1 :: \ldots :: u_n :: []$.
Two definitions that play a special role in the following are: 



Definition 2. A gm-application is called a cut if its head is not a variable.

Definition 3. A variable $x$ is main and linear in a term $t$ if $t = x$ or $t$ is of the
form $x(u, l, (y)v)$ where $x \notin u, l, v$. We write $\mathrm{mla}(x, v)$ if $v$ is a gm-application
and $x$ is main and linear in $v$.
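Definitions 2 and 3 translate directly into predicates on a term representation. The sketch below is an illustration under our own assumptions: the class names `Var`, `Lam`, `GmApp`, the encoding of lists as Python tuples, and the reading of "$x \notin u, l, v$" as "no free occurrence" are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


@dataclass(frozen=True)
class GmApp:
    """The gm-application t(u, l, (x)v): head t, first argument u,
    list l of further arguments, bound variable x, continuation v."""
    head: object
    arg: object
    args: Tuple[object, ...]
    var: str
    body: object


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.var}
    fv = free_vars(t.head) | free_vars(t.arg)
    for a in t.args:
        fv |= free_vars(a)
    return fv | (free_vars(t.body) - {t.var})


def is_cut(t):
    """Definition 2: a gm-application whose head is not a variable."""
    return isinstance(t, GmApp) and not isinstance(t.head, Var)


def main_and_linear(x, t):
    """Definition 3: t is x, or t = x(u, l, (y)v) with x not free in u, l, v."""
    if isinstance(t, Var):
        return t.name == x
    if isinstance(t, GmApp) and t.head == Var(x):
        rest = free_vars(t.arg) | (free_vars(t.body) - {t.var})
        for a in t.args:
            rest |= free_vars(a)
        return x not in rest
    return False


def mla(x, v):
    return isinstance(v, GmApp) and main_and_linear(x, v)
```

For instance, $x$ is main and linear in $x(u, [], (y)y)$, but not in $x(x, [], (y)y)$, where it also occurs as an argument.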



Formulas (= types) $A, B, C, \ldots$ are built up from propositional variables
using just $\supset$ (for implication) and contexts $\Gamma$ are finite sets of variable : formula
pairs, associating at most one formula to each variable.

Sequents of $\lambda J^m$ are of one of the following two forms

$$\Gamma; - \vdash t : A$$

$$\Gamma; B \vdash l : C,$$

called term sequents and list sequents respectively. The distinguished position
in the LHS of sequents is called the stoup and may either be empty (as in
term sequents) or hold a formula (the case of list sequents). Read a list sequent
$\Gamma; B \vdash l : C$ as “list $l$ leads the formula $B$ to its instance $C$ in context $\Gamma$”. $C$ is an
instance of $B$ if $B$ is of the form $B_1 \supset \ldots \supset B_k \supset C$, for some $k \geq 0$.



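Definition 4. The typing rules of $\lambda J^m$ are as follows:

$$\frac{}{x : A, \Gamma; - \vdash x : A}\;\mathit{Axiom} \qquad \frac{x : A, \Gamma; - \vdash t : B}{\Gamma; - \vdash \lambda x.t : A \supset B}\;\mathit{Right}$$

$$\frac{\Gamma; - \vdash t : A \supset B \quad \Gamma; - \vdash u : A \quad \Gamma; B \vdash l : C \quad x : C, \Gamma; - \vdash v : D}{\Gamma; - \vdash t(u, l, (x)v) : D}\;\mathit{gm\text{-}Elim}$$

$$\frac{}{\Gamma; C \vdash [] : C}\;\mathit{Ax} \qquad \frac{\Gamma; - \vdash u : A \quad \Gamma; B \vdash l : C}{\Gamma; A \supset B \vdash u :: l : C}\;\mathit{Lft}$$

with the proviso that $x : A$ does not belong to $\Gamma$ in Right and the proviso that
$x : C$ does not belong to $\Gamma$ in gm-Elim.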



An instance of rule gm-Elim is called a generalised multiary elimination
(or gm-elimination, for short). [2] explains in which sense these typing rules
define a sequent calculus which extends with cuts Schwichtenberg’s multiary
cut-free sequent calculus [6]. It also explains how to interpret $\lambda J^m$ in Herbelin’s
$\overline{\lambda}$-calculus [3], where the key idea is to interpret a gm-application $t(u, l, (x)v)$ as
the combination $v\{x := t(u :: l)\}$ of a head-cut and a mid-cut.
2.2 Reduction Rules

Definition 5. The reduction rules for $\lambda J^m$ are as follows:

($\beta_1$) $(\lambda x.t)(u, [], (y)v) \to s(s(u, x, t), y, v)$

($\beta_2$) $(\lambda x.t)(u, v :: l, (y)v') \to s(u, x, t)(v, l, (y)v')$

($\pi$) $t(u, l, (x)v)(u', l', (y)v') \to t(u, l, (x)v(u', l', (y)v'))$

($\mu$) $t(u, l, (x)x(u', l', (y)v)) \to t(u, \mathrm{append}(l, u', l'), (y)v)$, $x \notin u', l', v$

where

$s(t, x, x) = t$
$s(t, x, y) = y$, $y \neq x$
$s(t, x, \lambda y.u) = \lambda y.s(t, x, u)$
$s(t, x, u(v, l, (y)v')) = s(t, x, u)(s(t, x, v), s'(t, x, l), (y)s(t, x, v'))$

$s'(t, x, []) = []$
$s'(t, x, v :: l) = s(t, x, v) :: s'(t, x, l)$

$\mathrm{append}([], u, l) = u :: l$
$\mathrm{append}(u' :: l', u, l) = u' :: \mathrm{append}(l', u, l)$
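The substitution operation, the `append` operation and two of the reduction rules can be sketched executably. The encoding below is an assumption made for illustration: terms are the dataclasses `Var`, `Lam`, `GmApp`, lists are Python tuples, and substitution follows the clauses of Definition 5 literally, so it presupposes that bound variable names are distinct from the variable being replaced and from the free variables of the substituted term. Only root-level steps for ($\beta_1$) and ($\mu$) are shown.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


@dataclass(frozen=True)
class GmApp:
    head: object                 # t
    arg: object                  # u
    args: Tuple[object, ...]     # l
    var: str                     # x
    body: object                 # v


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.var}
    fv = free_vars(t.head) | free_vars(t.arg)
    for a in t.args:
        fv |= free_vars(a)
    return fv | (free_vars(t.body) - {t.var})


def subst(t, x, e):
    """s(t, x, e): replace x by t in e (no renaming; see assumptions above)."""
    if isinstance(e, Var):
        return t if e.name == x else e
    if isinstance(e, Lam):
        return Lam(e.var, subst(t, x, e.body))
    return GmApp(subst(t, x, e.head), subst(t, x, e.arg),
                 tuple(subst(t, x, a) for a in e.args),
                 e.var, subst(t, x, e.body))


def append(l1, u, l2):
    """append([], u, l) = u::l and append(u'::l', u, l) = u'::append(l', u, l)."""
    if not l1:
        return (u,) + l2
    return (l1[0],) + append(l1[1:], u, l2)


def beta1_step(e):
    """Root beta1 step: (lambda x.t)(u, [], (y)v) -> s(s(u, x, t), y, v)."""
    if isinstance(e, GmApp) and isinstance(e.head, Lam) and e.args == ():
        return subst(subst(e.arg, e.head.var, e.head.body), e.var, e.body)
    return None


def mu_step(e):
    """Root mu step: t(u, l, (x)x(u', l', (y)v)) -> t(u, append(l, u', l'), (y)v)."""
    if isinstance(e, GmApp) and isinstance(e.body, GmApp) and e.body.head == Var(e.var):
        inner = e.body
        occurs = free_vars(inner.arg) | (free_vars(inner.body) - {inner.var})
        for a in inner.args:
            occurs |= free_vars(a)
        if e.var not in occurs:  # side condition: x not in u', l', v
            return GmApp(e.head, e.arg,
                         append(e.args, inner.arg, inner.args),
                         inner.var, inner.body)
    return None
```

Both step functions return `None` when the term is not a redex of the corresponding kind at the root; a full reducer would additionally search for redexes inside subterms.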

A detailed motivation for the reduction rules can be found in [2]. In brief,
rules $(\beta_1)$, $(\beta_2)$ and $(\pi)$ perform cut-elimination, i.e. they aim at reducing all
gm-applications in a term to the form where the head is a variable. Reduction
rule $(\mu)$ is structural and is used to eliminate gm-applications $t(u, l, (x)v)$ such
that $mla(x, v)$.
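The rules of Definition 5 can be tried out on small terms. The following is a minimal sketch, not the authors' implementation: the tagged-tuple encoding and the names `Var`, `Lam`, `App`, `fv`, `beta`, `pi` and `mu` are assumptions made for illustration, steps are only performed at the root of a term, and substitution does no renaming (bound names are assumed distinct from free names).

```python
# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, l, x, v)  for t(u, l, (x)v),
# where the list l is a Python tuple of terms.
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def App(t, u, l, x, v): return ("app", t, u, tuple(l), x, v)

def fv(t):
    """Free variables of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, h, u, l, x, v = t
    return fv(h).union(fv(u), fv(v) - {x}, *map(fv, l))

def s(t, x, u):
    """s(t, x, u): substitute t for x in u, without renaming bound variables."""
    if u[0] == "var":
        return t if u[1] == x else u
    if u[0] == "lam":
        return u if u[1] == x else Lam(u[1], s(t, x, u[2]))
    _, h, a, l, y, v = u
    return App(s(t, x, h), s(t, x, a), [s(t, x, b) for b in l],
               y, v if y == x else s(t, x, v))

def append(l, u, l2):
    """append(l, u, l2): the list l followed by u :: l2."""
    return tuple(l) + (u,) + tuple(l2)

def beta(t):
    """Root beta1/beta2 step, or None if t is not a beta-redex."""
    if t[0] == "app" and t[1][0] == "lam":
        _, (_, x, body), u, l, y, v = t
        r = s(u, x, body)
        return s(r, y, v) if not l else App(r, l[0], l[1:], y, v)
    return None

def pi(t):
    """Root pi step, or None if the head of t is not a gm-application."""
    if t[0] == "app" and t[1][0] == "app":
        _, (_, h, u, l, x, v), u2, l2, y, v2 = t
        return App(h, u, l, x, App(v, u2, l2, y, v2))
    return None

def mu(t):
    """Root mu step, or None if t is not a mu-redex."""
    if t[0] == "app" and t[5][0] == "app":
        _, h, u, l, x, (_, h2, u2, l2, y, v) = t
        others = fv(u2).union(fv(v) - {y}, *map(fv, l2))
        if h2 == Var(x) and x not in others:
            return App(h, u, append(l, u2, l2), y, v)
    return None
```

For example, `mu` applied to the encoding of $t(u, [], (x)x(u_1, [], (y)v))$ returns the encoding of $t(u, [u_1], (y)v)$, while `beta` applied to $(\lambda x.x)(a, [b], (y)y)$ returns $a(b, [], (y)y)$.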

Consider the following grammar:

$t, u, v ::= x \mid \lambda x.t \mid t'(u, l, (y)v)$
$l ::= u::l \mid []$

The $\beta,\pi$-normal forms are generated by this grammar provided $t'$ is a variable.
The $\mu$-normal forms are generated by this grammar provided that in the last
production for terms, not $mla(y, v)$, i.e. if $v$ is of the form $y(u', l', (y')v')$, then
$y$ must occur either in $u'$, $l'$ or $v'$. Finally $\beta,\pi,\mu$-normal forms are generated by
this grammar provided the last production satisfies the two provisos above.
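The two provisos can be read as simple recursive checks. The sketch below is illustrative only: the tuple encoding and the helper names (`children`, `mla`, `is_beta_pi_normal`, `is_mu_normal`) are assumptions, and bound names are assumed distinct from free names.

```python
# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, l, x, v)  for t(u, l, (x)v).
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def App(t, u, l, x, v): return ("app", t, u, tuple(l), x, v)

def fv(t):
    """Free variables of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, h, u, l, x, v = t
    return fv(h).union(fv(u), fv(v) - {x}, *map(fv, l))

def children(t):
    """Immediate subterms of t."""
    if t[0] == "var":
        return []
    if t[0] == "lam":
        return [t[2]]
    _, h, u, l, _, v = t
    return [h, u, *l, v]

def mla(y, v):
    """mla(y, v): v = y(u', l', (y')v') with y not occurring in u', l', v'."""
    if v[0] != "app" or v[1] != Var(y):
        return False
    _, _, u, l, y2, v2 = v
    return y not in fv(u).union(fv(v2) - {y2}, *map(fv, l))

def is_beta_pi_normal(t):
    """Every gm-application has a variable as head."""
    head_ok = t[0] != "app" or t[1][0] == "var"
    return head_ok and all(is_beta_pi_normal(c) for c in children(t))

def is_mu_normal(t):
    """No gm-application t(u, l, (y)v) satisfies mla(y, v)."""
    redex = t[0] == "app" and mla(t[4], t[5])
    return not redex and all(is_mu_normal(c) for c in children(t))
```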

As observed in [2], subject reduction holds for $\to_{\beta,\pi,\mu}$.



2.3 $\lambda J$: The Generalised $\lambda$-Calculus

We now introduce the cons-free subsystem of $\lambda J^m$, called $\lambda J$.

Definition 6. Terms and lists of $\lambda J$ are as follows:

($\lambda J$-terms)  $t, u, v ::= x \mid \lambda x.t \mid t(u, l, (x)v)$
($\lambda J$-lists)  $l ::= []$

$\Lambda J$ is used to denote the set of $\lambda J$-terms.

Since there is only one form of lists in $\lambda J$, every gm-application in $\lambda J$ is of
the form $t(u, [], (x)v)$, which we call a generalised application (or g-application,
for short). $\lambda J$-terms can simply be described as:

($\lambda J$-terms)  $t, u, v ::= x \mid \lambda x.t \mid t(u \cdot (x)v)$,

where $t(u \cdot (x)v)$ is used as an abbreviation of $t(u, [], (x)v)$. This expression can
be typed by the derived rule (called generalised elimination)

$$\frac{\Gamma; - \vdash t : A \supset B \quad \Gamma; - \vdash u : A \quad x:B, \Gamma; - \vdash v : C}{\Gamma; - \vdash t(u \cdot (x)v) : C}\ \textit{g-Elim} \qquad (1)$$

with the proviso that $x:B$ does not belong to $\Gamma$. This rule corresponds to an instance of
the rule gm-Elim where the penultimate premiss is an instance of Ax.

Definition 7. The reduction rules for $\lambda J$ are as follows:

$(\beta_1)$  $(\lambda x.t)(u \cdot (y)v) \to s(s(u, x, t), y, v)$

$(\pi)$  $t(u \cdot (x)v)(u' \cdot (y)v') \to t(u \cdot (x)v(u' \cdot (y)v'))$

where

$s(t, x, x) = t$
$s(t, x, y) = y$,  $y \neq x$
$s(t, x, \lambda y.u) = \lambda y.s(t, x, u)$
$s(t, x, u(v \cdot (y)v')) = s(t, x, u)(s(t, x, v) \cdot (y)s(t, x, v'))$

Compared with $\lambda J^m$, $\lambda J$ drops all rules and clauses involving cons. Since
$\beta_2$-redexes and $\mu$-contracta fall outside $\lambda J$ (notice that $append([], u', l')$ is a
cons-list), the rules $(\beta_2)$ and $(\mu)$ are omitted.

The system thus obtained is no more than a notational variant of the $\Lambda J$-calculus
of Joachimski and Matthes.

3 Relating Generality and Multiarity 

Generality can express multiarity and multiarity is a shorthand for certain forms 
of generality. In this section this idea is made precise and consequences of it are 
extracted. 

3.1 The Bijection between Terms of $\lambda J$ and $\mu$-Normal Forms

We start by explaining how to express multiarity in terms of generality. The basic
idea is to replace each cons by a g-application that introduces a fresh name. For
instance,

$t(u, [u_1, u_2], (x)v) \mapsto t(u \cdot (z_1)z_1(u_1 \cdot (z_2)z_2(u_2 \cdot (x)v)))$,

where $z_1$ and $z_2$ are fresh variables. This idea is embodied in the following
type-preserving mapping.
Definition 8. The mapping $\nu$ is as follows.

$\nu : \Lambda J^m \to \Lambda J$

$\nu(x) = x$
$\nu(\lambda x.t) = \lambda x.\nu(t)$
$\nu(t(u, l, (x)v)) = \nu(t)(\nu(u) \cdot (z)\nu'(z, l, x, \nu(v)))$,  $z$ fresh

$\nu'(z, [], x, v) = s(z, x, v)$
$\nu'(z, u::l, x, v) = z(\nu(u) \cdot (w)\nu'(w, l, x, v))$,  $w$ fresh
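A small sketch of the mapping $\nu$ follows. It is an illustration under stated assumptions rather than the paper's definition verbatim: terms use a tagged-tuple encoding, fresh names are taken from a counter and assumed not to occur in the input, and, instead of renaming the innermost binder via $s(z, x, v)$, the sketch keeps the original name $x$ there, which yields an $\alpha$-equivalent term.

```python
import itertools

# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, l, x, v)  for t(u, l, (x)v).
# A g-application t(u . (x)v) is the case where l is the empty tuple.
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def App(t, u, l, x, v): return ("app", t, u, tuple(l), x, v)

_counter = itertools.count(1)

def fresh():
    """Hypothetical fresh-name supply: z1, z2, ... (assumed unused elsewhere)."""
    return f"z{next(_counter)}"

def nu(t):
    """Replace every cons of a gm-application by a g-application on a fresh name."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return Lam(t[1], nu(t[2]))
    _, h, u, l, x, v = t
    # One fresh binder per element of l; the last binder keeps the name x.
    names = [fresh() for _ in l] + [x]
    body = nu(v)
    # Build inside-out: z_i(nu(u_i) . (z_{i+1}) ...).
    for i in range(len(l) - 1, -1, -1):
        body = App(Var(names[i]), nu(l[i]), [], names[i + 1], body)
    return App(nu(h), nu(u), [], names[0], body)
```

Applied to the encoding of $t(u, [u_1, u_2], (x)v)$, the first call produces the encoding of $t(u \cdot (z_1)z_1(u_1 \cdot (z_2)z_2(u_2 \cdot (x)v)))$, matching the example above.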

Conversely, in $t(u, l, (x)v)$, if $v$ is a gm-application $x(u', l', (y)v')$ such that
$x \notin u', l', v'$, then $v$ may be eliminated with the help of cons. In fact, the former
term can be reduced to $t(u, append(l, u', l'), (y)v')$, where the append operation
generates $u'::l'$ and, if $l$ is not empty, a further cons to concatenate $l$ with $u'::l'$.
This is precisely reduction rule $\mu$. The following type-preserving mapping reduces
the $\mu$-redexes of a term in an innermost-first fashion.

Definition 9. The mapping $\mu$ is as follows.

$\mu : \Lambda J^m \to \Lambda J^m$

$\mu(x) = x$
$\mu(\lambda x.t) = \lambda x.\mu(t)$
$\mu(t(u, l, (x)v)) = \begin{cases} \mu(t)(\mu(u), append(\mu'(l), u', l'), (y)v'), & \text{if } \mu(v) = x(u', l', (y)v') \text{ and } x \notin u', l', v' \\ \mu(t)(\mu(u), \mu'(l), (x)\mu(v)), & \text{otherwise} \end{cases}$

$\mu'([]) = []$
$\mu'(u::l) = \mu(u)::\mu'(l)$

The results that follow show that the restriction of mapping $\mu$ to $\Lambda J$ and the
restriction of mapping $\nu$ to $\mu$-normal forms are mutual inverses.
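The following sketch implements $\mu$ next to a version of $\nu$, so that the round trip $\mu(\nu(t)) = t$ can be checked on a $\mu$-normal term. The encoding, the fresh-name counter and the function names `nu` and `mu_nf` are assumptions of this illustration; bound names are assumed distinct from free names, and `nu` keeps the name of the innermost binder (an $\alpha$-equivalent variant of Definition 8).

```python
import itertools

# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, l, x, v)  for t(u, l, (x)v).
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def App(t, u, l, x, v): return ("app", t, u, tuple(l), x, v)

def fv(t):
    """Free variables of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, h, u, l, x, v = t
    return fv(h).union(fv(u), fv(v) - {x}, *map(fv, l))

_counter = itertools.count(1)

def nu(t):
    """Mapping nu: each cons becomes a g-application on a fresh name z1, z2, ..."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return Lam(t[1], nu(t[2]))
    _, h, u, l, x, v = t
    names = [f"z{next(_counter)}" for _ in l] + [x]
    body = nu(v)
    for i in range(len(l) - 1, -1, -1):
        body = App(Var(names[i]), nu(l[i]), [], names[i + 1], body)
    return App(nu(h), nu(u), [], names[0], body)

def mu_nf(t):
    """Mapping mu: contract mu-redexes innermost-first."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return Lam(t[1], mu_nf(t[2]))
    _, h, u, l, x, v = t
    h, u, l, v = mu_nf(h), mu_nf(u), tuple(mu_nf(a) for a in l), mu_nf(v)
    if v[0] == "app" and v[1] == Var(x):
        _, _, u2, l2, y, v2 = v
        if x not in fv(u2).union(fv(v2) - {y}, *map(fv, l2)):
            # append(mu'(l), u', l') is l followed by u' :: l'
            return App(h, u, l + (u2,) + l2, y, v2)
    return App(h, u, l, x, v)
```

On the encoding of the $\mu$-normal term $t(u, [u_1, u_2], (x)v)$, `mu_nf(nu(...))` gives back the same term, since the fresh names introduced by `nu` are exactly the ones removed by the $\mu$-contractions.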

Lemma 1. $t \to^*_\mu \mu(t)$, for all $t \in \Lambda J^m$.

Proof. Proved together with $l \to^*_\mu \mu'(l)$, for all $l \in \mathcal{L}J^m$, by simultaneous
induction on $t$ and $l$. □

Lemma 2. If $t \to_\mu t'$, then (i) $\mu(t) = \mu(t')$ and (ii) $\nu(t) = \nu(t')$, for all $t, t' \in \Lambda J^m$.

Proof. (i) is proved together with: $l \to_\mu l'$ implies $\mu'(l) = \mu'(l')$, for all $l, l' \in \mathcal{L}J^m$,
by simultaneous induction on $t \to_\mu t'$ and $l \to_\mu l'$. (ii) is proved together with:
$l \to_\mu l'$ implies $\nu'(z, l, x, v) = \nu'(z, l', x, v)$, for all $l, l' \in \mathcal{L}J^m$ and all $v \in \Lambda J$, by
simultaneous induction on $t \to_\mu t'$ and $l \to_\mu l'$. □

Lemma 3. $\mu(t)$ is $\mu$-normal, for all $t \in \Lambda J^m$.

Proof. Proved together with: $\mu'(l)$ is $\mu$-normal, for all $l \in \mathcal{L}J^m$, by simultaneous
induction on $t$ and $l$. □



Proposition 1. (i) $\to_\mu$ is confluent.

(ii) $\to_\mu$ is strongly normalising.

(iii) $\mu(t)$ is the unique normal form of $t$ w.r.t. $\to_\mu$, for all $t \in \Lambda J^m$.

Proof. (i) follows from lemmas 1 and 2. In order to guarantee (ii), observe that
each $\mu$-step reduces the number of $\mu$-redexes. (iii) results from the combination
of lemmas 1 and 3 and confluence of $\to_\mu$. □



Lemma 4. $\nu(t) \to^*_\mu t$, for all $t \in \Lambda J^m$.

Proof. Proved together with $t(u \cdot (z)\nu'(z, l, x, v)) \to^*_\mu t(u, l, (x)v)$, for all $t, u, v \in \Lambda J$
and all $l \in \mathcal{L}J^m$ s.t. $z \notin l, v$, by simultaneous induction on $t$ and $l$. □

Corollary 1. $t \to^*_\mu \mu(\nu(t))$, for all $t \in \Lambda J^m$.

Proof. By Lemma 1, it suffices to show $\mu(\nu(t)) = \mu(t)$. From Lemma 1 (applied twice)
and Lemma 4, $\nu(t)$ reduces both to $\mu(\nu(t))$ and $\mu(t)$, which are $\mu$-normal. Thus
by confluence, $\mu(\nu(t)) = \mu(t)$. □



Proposition 2. (i) $\nu(t) = t$, for all $t \in \Lambda J$.

(ii) $\mu(t) = t$, for all $\mu$-normal $t \in \Lambda J^m$.

Proof. (i) Follows by induction on $t$. (ii) Since $t$ is $\mu$-normal, Proposition 1
imposes $t = \mu(t)$. □

Proposition 3. (i) $\nu(\mu(t)) = t$, for all $t \in \Lambda J$.

(ii) $\mu(\nu(t)) = t$, for all $\mu$-normal $t \in \Lambda J^m$.

Proof. (i) From lemmas 1 and 2 we get $\nu(\mu(t)) = \nu(t)$, which is just $t$ by the
proposition above. (ii) Lemmas 1 and 4 imply reduction of $\nu(t)$ to $\mu(\nu(t))$ and $t$
respectively. Thus $t$ and $\mu(\nu(t))$ are two $\mu$-normal forms of $\nu(t)$, which by
confluence of $\to_\mu$ must be equal. □



3.2 Preservation of Reduction by Mappings $\mu$ and $\nu$

Preservation of reduction $\mu$ is considered in Lemma 2.

Lemma 5. (i) If $t \to_\beta t'$, then $\nu(t) \to_\beta \nu(t')$, for all $t, t' \in \Lambda J^m$.

(ii) If $t \to_\beta t'$, then $\mu(t) \to_\beta \to^*_\mu \mu(t')$, for all $t, t' \in \Lambda J^m$.

Proof. (i) is proved together with: $l \to_\beta l'$ implies $\nu'(z, l, x, v) \to_\beta \nu'(z, l', x, v)$, for
all $l, l' \in \mathcal{L}J^m$ and all $v \in \Lambda J$, by simultaneous induction on $t \to_\beta t'$ and $l \to_\beta l'$.
(ii) follows from the commutation in $\lambda J^m$ between $\to_\beta$ and $\to_\mu$: if $t \to_\beta t_1$ and
$t \to^*_\mu t_2$, there exists $t_3$ such that $t_1 \to^*_\mu t_3$ and $t_2 \to_\beta t_3$. □

In contrast to rule $(\beta)$, one-to-one preservation of $\pi$-steps is problematic:
mapping $\nu$ needs several steps in $\lambda J$ to simulate a single step in $\lambda J^m$ and mapping
$\mu$ does not even preserve $\pi$-steps. These mismatches, between rule $(\pi)$ and
mappings $\nu$ and $\mu$, are an obstacle to proving confluence of $\lambda J^m$ along the lines
of the proof of Theorem 5, where we lift confluence of $\lambda J$. Such a proof requires
preservation of $(\pi)$ (as well as $(\beta)$) by mapping $\mu$. We illustrate these mismatches
with an example.

Let $t, u, u_1, u_2, u', v$ be $\mu$-normal forms in $\Lambda J$, hence invariant both for $\mu$ and
$\nu$. Consider the following three terms in $\Lambda J$

$t_0 = t(u \cdot (z_1)z_1(u_1 \cdot (z_2)z_2(u_2 \cdot (x)x)))(u' \cdot (y)v)$,
$t_1 = t(u \cdot (z_1)z_1(u_1 \cdot (z_2)z_2(u_2 \cdot (x)x))(u' \cdot (y)v))$,
$t_2 = t(u \cdot (z_1)z_1(u_1 \cdot (z_2)z_2(u_2 \cdot (x)x(u' \cdot (y)v))))$,

and the corresponding $\mu$-normal forms

$u_0 = \mu(t_0) = t(u, [u_1, u_2], (x)x)(u' \cdot (y)v)$,
$u_1 = \mu(t_1) = t(u \cdot (z_1)z_1(u_1, [u_2], (x)x)(u' \cdot (y)v))$,
$u_2 = \mu(t_2) = t(u, [u_1, u_2, u'], (y)v)$.

Consider also

$v_1 = t(u \cdot (z_1)z_1(u_1, [u_2], (x)x(u' \cdot (y)v)))$,
$v_2 = t(u, [u_1, u_2], (x)x(u' \cdot (y)v))$.

Observe that $\nu(u_0) = t_0$, $\nu(u_1) = t_1$ and $\nu(u_2) = \nu(v_1) = \nu(v_2) = t_2$. Observe
also that there are the following reductions among these terms:

[Diagram of the reductions among $t_0, t_1, t_2, u_0, u_1, u_2, v_1, v_2$, including $v_1 \to v_2$; not recoverable from the source.]

Notice that $u_0 \to_\pi v_2$ whereas $\nu(u_0)$ requires three $\pi$-steps to reach $\nu(v_2)$. In
general we have the following:
general we have the following: 
Lemma 6. If $t \to_\pi t'$, then $\nu(t) \to^+_\pi \nu(t')$, for all $t, t' \in \Lambda J^m$.

Proof. Proved together with: $l \to_\pi l'$ implies $\nu'(z, l, x, v) \to^+_\pi \nu'(z, l', x, v)$, for all
$l, l' \in \mathcal{L}J^m$ and all $v \in \Lambda J$, by simultaneous induction on $t \to_\pi t'$ and $l \to_\pi l'$. □

Going back to the example, observe that $t_0 \to_\pi t_1$ but $\mu(t_0)$ does not reduce
to $\mu(t_1)$, it $\pi$-reduces to $v_2$. However, making enough $\pi$-reductions from $t_1$, one
reaches a term ($t_2$ in the example) whose $\mu$-normal form ($u_2$ in the example)
is the same as the $\mu$-normal form of $v_2$. Making enough $\pi$-reductions means to
perform $\pi$-reductions as long as this generates $\pi$-redexes which hide $\mu$-redexes.
For instance, observe that the head of $t_0$ is a $\mu$-redex. The reduction $t_0 \to_\pi t_1$
creates the $\pi$-redex $z_1(u_1 \cdot (z_2)z_2(u_2 \cdot (x)x))(u' \cdot (y)v)$ which hides in $t_1$ the mentioned
$\mu$-redex. Since the reduction of this $\pi$-redex causes a descendant of the original
$\mu$-redex to reappear, we perform it. Moreover, as another $\mu$-redex becomes hidden,
this process continues. We introduce a new reduction rule in $\lambda J$ to perform such
sequences of $\pi$-reductions in a single step.

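Definition 10. The rule $(\pi')$ is the following:

$(\pi')$  $t(u \cdot (x)v)(u' \cdot (y)v') \to t(u \cdot (x)@'(x, v, u', y, v'))$

where

$@'(x, t, u, y, v) = \begin{cases} x(u' \cdot (z)@'(z, v', u, y, v)), & \text{if } t = x(u' \cdot (z)v') \text{ and } x \notin u', v' \\ t(u \cdot (y)v), & \text{otherwise} \end{cases}$

For instance, in the example before $t_1 \to_{\pi'} t_2$. Observe that $\to_{\pi'} \subseteq \to^+_\pi$ and that
a term is $\pi'$-normal if and only if it is $\pi$-normal.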
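To see rule $(\pi')$ at work on the running example, here is a minimal sketch of $@'$ and of a root $(\pi')$ step. The tuple encoding and the names `G`, `at_prime` and `pi_prime` are assumptions of this illustration; only $\lambda J$-terms (empty lists) are handled, and bound names are assumed distinct from free names.

```python
# Lambda-J terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, (), x, v)  for t(u . (x)v).
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def G(t, u, x, v): return ("app", t, u, (), x, v)

def fv(t):
    """Free variables of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, h, u, l, x, v = t
    return fv(h).union(fv(u), fv(v) - {x}, *map(fv, l))

def at_prime(x, t, u, y, v):
    """@'(x, t, u, y, v): push the application (u . (y)v) through hidden mu-redexes."""
    if t[0] == "app" and t[1] == Var(x):
        _, _, u1, _, z, v1 = t
        if x not in fv(u1) | (fv(v1) - {z}):
            return G(Var(x), u1, z, at_prime(z, v1, u, y, v))
    return G(t, u, y, v)

def pi_prime(t):
    """Root (pi') step, or None if the head of t is not a g-application."""
    if t[0] == "app" and t[1][0] == "app":
        _, (_, h, u, _, x, v), u2, _, y, v2 = t
        return G(h, u, x, at_prime(x, v, u2, y, v2))
    return None
```

With $t, u, u_1, u_2, u', v$ taken to be variables, a single root step `pi_prime` on the encoding of $t_0$ yields the encoding of $t_2$, whereas plain $\pi$ needs three steps.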

We now see how the situation has improved w.r.t. the preservation of $\pi$-steps.

Lemma 7. (i) If $t \to_\pi t'$, then $\nu(t) \to_{\pi'} \nu(t')$,
for all $t, t' \in \Lambda J^m$ such that $t$ is $\mu$-normal.

(ii) If $t \to_{\pi'} t'$, then $\mu(t) \to_\pi \to^*_\mu \mu(t')$, for all $t, t' \in \Lambda J$.

Proof. (ii) is proved by induction on $t \to_{\pi'} t'$. (i) is proved together with: $l \to_\pi l'$
implies $\nu'(z, l, x, v) \to_{\pi'} \nu'(z, l', x, v)$, for all $l, l' \in \mathcal{L}J^m$ and all $v \in \Lambda J$, by
simultaneous induction on $t \to_\pi t'$ and $l \to_\pi l'$. □

Now we turn to some basic results about rule $(\pi')$, leading to Corollary 2,
which shows how to perform a sequence of $(\beta)$ and $(\pi)$ reductions by means of
a sequence of $(\beta)$ and $(\pi')$ reductions. The proof of confluence of the relation
$\to_{\beta,\pi,\mu}$ on $\lambda J^m$-terms, given in Section 4, uses this transformation and the lemma
above.

The mapping $\pi$ in the definition below is considered in [4] and produces
$\pi$-normal forms.
Definition 11. The mapping $\pi$ is as follows.

$\pi : \Lambda J \to \Lambda J$

$\pi(x) = x$
$\pi(\lambda x.t) = \lambda x.\pi(t)$
$\pi(t(u \cdot (x)v)) = @(\pi(t), \pi(u), x, \pi(v))$

where

$@(t, u, x, v) = \begin{cases} t'(u' \cdot (y)@(v', u, x, v)), & \text{if } t = t'(u' \cdot (y)v') \\ t(u \cdot (x)v), & \text{otherwise} \end{cases}$

[4] observes that (i) $t \to^*_\pi \pi(t)$, for all $t \in \Lambda J$; (ii) if $t \to^*_\pi t'$ then $\pi(t) = \pi(t')$,
for all $t, t' \in \Lambda J$ (and from these two follows confluence of $\to_\pi$); (iii) $\to_\pi$ is
strongly normalising for all terms of $\Lambda J$.
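The mapping $\pi$ of Definition 11 is directly executable. The sketch below is illustrative: the tuple encoding and the names `G`, `at`, `pi_nf` and `is_pi_normal` are assumptions, and only $\lambda J$-terms (empty lists) are handled.

```python
# Lambda-J terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, (), x, v)  for t(u . (x)v).
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def G(t, u, x, v): return ("app", t, u, (), x, v)

def at(t, u, x, v):
    """@(t, u, x, v): apply t to (u . (x)v), pushing the application inwards."""
    if t[0] == "app":
        _, h, u1, _, y, v1 = t
        return G(h, u1, y, at(v1, u, x, v))
    return G(t, u, x, v)

def pi_nf(t):
    """Mapping pi: compute the pi-normal form of a lambda-J term."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return Lam(t[1], pi_nf(t[2]))
    _, h, u, _, x, v = t
    return at(pi_nf(h), pi_nf(u), x, pi_nf(v))

def is_pi_normal(t):
    """A term is pi-normal when no g-application has a g-application as head."""
    if t[0] == "var":
        return True
    if t[0] == "lam":
        return is_pi_normal(t[2])
    _, h, u, _, _, v = t
    return h[0] != "app" and all(is_pi_normal(c) for c in (h, u, v))
```

On the running example (with all metavariables read as variables), `pi_nf` maps the encoding of $t_0$ to the encoding of $t_2$, which is $\pi$-normal.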

The next lemma establishes that rule $(\pi')$ suffices to reduce a term to its $\pi$-normal
form.

Lemma 8. $t \to^*_{\pi'} \pi(t)$, for all $t \in \Lambda J$.

Proof. Because $\to_{\pi'} \subseteq \to^+_\pi$ and $\to_\pi$ is terminating, $\to_{\pi'}$ is also terminating. Let
$t'$ be a $\pi'$-normal form of $t$. Since $t'$ is also a $\pi$-normal form, $t \to^*_\pi t'$ and $\to_\pi$ is
confluent, it follows that $t' = \pi(t)$. Thus $t \to^*_{\pi'} \pi(t)$. □

We now establish a kind of commutation between reduction $\to_\beta$ and mapping
$\pi$, which uses the next lemma.

Lemma 9. $s(\pi(t), x, \pi(u)) \to^*_{\pi'} \pi(s(t, x, u))$, for all $t, u \in \Lambda J$.

Proof. The proof is by induction on $u$. It uses the fact that, for all $t, t_0, u_0, v_0 \in \Lambda J$,
$s(t, x, @(t_0, u_0, y, v_0)) \to^*_{\pi'} @(s(t, x, t_0), s(t, x, u_0), y, s(t, x, v_0))$, proved by
induction on $t_0$. □

Proposition 4. If $t \to_\beta u$, then $\pi(t) \to^*_{\beta,\pi'} \pi(u)$, for all $t, u \in \Lambda J$.

Proof. By induction on $t \to_\beta u$. The base case uses the lemma before. □

Corollary 2. If $t \to^*_{\beta,\pi} u$, then $\pi(t) \to^*_{\beta,\pi'} \pi(u)$, for all $t, u \in \Lambda J$.

Proof. Follows by induction on the number of steps in the reduction sequence.
The case corresponding to a $\beta$-step uses the proposition before and the case
corresponding to a $\pi$-step uses invariance of $\to_\pi$ w.r.t. mapping $\pi$. □
3.3 Two Isomorphic Subsystems of $\lambda J^m$

Some of the preservation results obtained above can be put together so that the
bijection between $\mu$-normal forms and terms of $\lambda J$ becomes an isomorphism,
provided those two sets of terms are equipped with appropriate reduction
relations.

Let $\lambda J'$ denote the system obtained from $\lambda J$ replacing rule $(\pi)$ by rule $(\pi')$.
Let $\lambda J^m_\mu$ denote the subsystem of $\mu$-normal forms of $\lambda J^m$ obtained by closing
relation $\to_{\beta,\pi}$ for mapping $\mu$. More precisely, in $\lambda J^m_\mu$ the one step relations $\to_{\beta_\mu}$
and $\to_{\pi_\mu}$ are given by:

$t \to_{\beta_\mu} t'$ if $t \to_\beta t''$ and $t' = \mu(t'')$, for some $t'' \in \Lambda J^m$;
$t \to_{\pi_\mu} t'$ if $t \to_\pi t''$ and $t' = \mu(t'')$, for some $t'' \in \Lambda J^m$.

Notice that in $\lambda J^m_\mu$ there is no need for a $\mu$-reduction.

Theorem 1. (i) $t \to_{\beta_\mu} t'$ iff $\nu(t) \to_\beta \nu(t')$, for all $\mu$-normal forms $t, t'$.

(ii) $t \to_{\pi_\mu} t'$ iff $\nu(t) \to_{\pi'} \nu(t')$, for all $\mu$-normal forms $t, t'$.

(iii) $t \to_\beta t'$ iff $\mu(t) \to_{\beta_\mu} \mu(t')$, for all $t, t' \in \Lambda J$.

(iv) $t \to_{\pi'} t'$ iff $\mu(t) \to_{\pi_\mu} \mu(t')$, for all $t, t' \in \Lambda J$.

Proof. We just show the “only if” statements since the “if” statements follow
from these and the fact that $\nu$ and $\mu$ are mutual inverses. (i) follows from lemmas
1, 2 and 5. (ii) follows from lemmas 1, 2 and 7. (iii) and (iv) hold by lemmas 5
and 7 respectively. □

Now confluence and strong normalisation of relation $\to_{\beta,\pi}$ on $\Lambda J$ are used to
obtain corresponding properties for $\lambda J'$ and thus for its isomorphic system $\lambda J^m_\mu$.

Theorem 2. $\to_{\beta,\pi'}$ in $\lambda J'$ is confluent.

Proof. Assume $t \to^*_{\beta,\pi'} t_1$ and $t \to^*_{\beta,\pi'} t_2$. Then, since $\to_{\pi'} \subseteq \to^+_\pi$, also $t \to^*_{\beta,\pi} t_1$ and
$t \to^*_{\beta,\pi} t_2$. Using confluence of $\to_{\beta,\pi}$ for $\lambda J$, there exists $t_3$ such that $t_1 \to^*_{\beta,\pi} t_3$ and
$t_2 \to^*_{\beta,\pi} t_3$. So, using Corollary 2 followed by Lemma 8, one obtains $t_1 \to^*_{\beta,\pi'} \pi(t_3)$
and $t_2 \to^*_{\beta,\pi'} \pi(t_3)$. □

Theorem 3. There is no infinite $\to_{\beta,\pi'}$-reduction starting at a typable term of
$\lambda J'$.

Proof. If there was, since $\to_{\pi'} \subseteq \to^+_\pi$, one could build an infinite sequence of
$\beta,\pi$-steps, starting at a typable term of $\lambda J$, contradicting strong normalisation of
$\lambda J$. □
4 Results of Confluence and Strong Normalisation for $\lambda J^m$

This section studies confluence and strong normalisation for the notions of
reduction in $\lambda J^m$ resulting from all possible combinations of rules $(\beta)$, $(\pi)$ and
$(\mu)$. The proofs of confluence presented here follow one of two directions: (i)
for notions of reduction involving only rules $(\beta)$ and $(\pi)$, arguments are simple
extensions of those used in [4]; (ii) for notions of reduction including $\mu$,
arguments are built in a modular way, using essentially properties of preservation of
reduction by mappings $\mu$ and $\nu$, together with confluence results for $\lambda J$. Strong
normalisation of $\to_{\beta,\pi,\mu}$ for all typable terms of $\lambda J^m$ is obtained from the strong
normalisation of $\to_{\beta,\pi}$ for $\lambda J$’s typable terms, with the help of results of
preservation of reduction by mapping $\nu$. Strong normalisation of typable terms for
all the other relations follows, since they are included in $\to_{\beta,\pi,\mu}$. In fact, for
relations not involving rule $(\beta)$, strong normalisation holds for all terms.



4.1 Confluence

Firstly we tackle confluence of relations $\to_\pi$, $\to_\beta$ and $\to_{\beta,\pi}$ in $\lambda J^m$. The following
definition extends Definition 11.

Definition 12. The mapping $\pi$ is as follows.

$\pi : \Lambda J^m \to \Lambda J^m$

$\pi(x) = x$
$\pi(\lambda x.t) = \lambda x.\pi(t)$
$\pi(t(u, l, (x)v)) = @(\pi(t), \pi(u), \pi'(l), x, \pi(v))$

$\pi'([]) = []$
$\pi'(u::l) = \pi(u)::\pi'(l)$

where

$@(t, u, l, x, v) = \begin{cases} t'(u', l', (y)@(v', u, l, x, v)), & \text{if } t = t'(u', l', (y)v') \\ t(u, l, (x)v), & \text{otherwise} \end{cases}$

Lemma 10. $\pi(t)$ is $\pi$-normal, for all $t \in \Lambda J^m$.

Proof. Proved together with: $\pi'(l)$ is $\pi$-normal, for all $l \in \mathcal{L}J^m$, by simultaneous
induction on $t$ and $l$. □

Lemma 11. $t \to^*_\pi \pi(t)$, for all $t \in \Lambda J^m$.

Proof. Proved together with $l \to^*_\pi \pi'(l)$, for all $l \in \mathcal{L}J^m$, by simultaneous
induction on $t$ and $l$. □
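Lemma 12. If $t_1 \to^*_\pi t_2$, then $\pi(t_1) = \pi(t_2)$, for all $t_1, t_2 \in \Lambda J^m$.

Proof. Proved together with the fact that $l_1 \to^*_\pi l_2$ implies $\pi'(l_1) = \pi'(l_2)$, for all
$l_1, l_2 \in \mathcal{L}J^m$, by simultaneous induction on $t_1 \to^*_\pi t_2$ and $l_1 \to^*_\pi l_2$. □

Proposition 5. $\to^*_\pi$ has the triangle property w.r.t. mapping $\pi$.$^1$

Proof. If $t_1 \to^*_\pi t_2$, from the two lemmas above, $t_2 \to^*_\pi \pi(t_2) = \pi(t_1)$. □

Definition 13. Reduction $\Rightarrow_\beta$ is inductively defined on terms and lists of $\lambda J^m$ as follows:

$x \Rightarrow_\beta x$;

$\lambda x.t \Rightarrow_\beta \lambda x.t'$ if $t \Rightarrow_\beta t'$;

$t(u, l, (x)v) \Rightarrow_\beta t'(u', l', (x)v')$ if $t \Rightarrow_\beta t'$, $u \Rightarrow_\beta u'$, $l \Rightarrow_\beta l'$, $v \Rightarrow_\beta v'$;

$(\lambda y.t)(u, [], (x)v) \Rightarrow_\beta s(s(u', y, t'), x, v')$ if $t \Rightarrow_\beta t'$, $u \Rightarrow_\beta u'$, $v \Rightarrow_\beta v'$;

$(\lambda y.t)(u, u_0::l, (x)v) \Rightarrow_\beta s(u', y, t')(u'_0, l', (x)v')$ if
$t \Rightarrow_\beta t'$, $u \Rightarrow_\beta u'$, $u_0 \Rightarrow_\beta u'_0$, $l \Rightarrow_\beta l'$, $v \Rightarrow_\beta v'$;

$[] \Rightarrow_\beta []$;

$u::l \Rightarrow_\beta u'::l'$ if $u \Rightarrow_\beta u'$, $l \Rightarrow_\beta l'$.

Observe that $\Rightarrow_\beta$ is reflexive and $\to_\beta \subseteq \Rightarrow_\beta \subseteq \to^*_\beta$.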

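Definition 14. The mapping $\beta$ is as follows.

$\beta : \Lambda J^m \to \Lambda J^m$

$x^\beta = x$
$(\lambda x.t)^\beta = \lambda x.t^\beta$
$t(u, l, (x)v)^\beta = \begin{cases} s(s(u^\beta, y, t_1^\beta), x, v^\beta), & \text{if } t = \lambda y.t_1 \text{ and } l = [] \\ s(u^\beta, y, t_1^\beta)(u_1^\beta, l_1^\beta, (x)v^\beta), & \text{if } t = \lambda y.t_1 \text{ and } l = u_1::l_1 \\ t^\beta(u^\beta, l^\beta, (x)v^\beta), & \text{otherwise} \end{cases}$

$[]^\beta = []$
$(u::l)^\beta = u^\beta::l^\beta$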
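Definition 14 is a complete-development operator: it contracts, inside-out, every $\beta$-redex already present in a term. A minimal sketch follows; the tuple encoding and the name `dev` are assumptions of this illustration, and substitution performs no renaming (bound names are assumed distinct from free names).

```python
# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, t) | ("app", t, u, l, x, v)  for t(u, l, (x)v).
def Var(x): return ("var", x)
def Lam(x, t): return ("lam", x, t)
def App(t, u, l, x, v): return ("app", t, u, tuple(l), x, v)

def s(t, x, u):
    """s(t, x, u): substitute t for x in u, without renaming bound variables."""
    if u[0] == "var":
        return t if u[1] == x else u
    if u[0] == "lam":
        return u if u[1] == x else Lam(u[1], s(t, x, u[2]))
    _, h, a, l, y, v = u
    return App(s(t, x, h), s(t, x, a), [s(t, x, b) for b in l],
               y, v if y == x else s(t, x, v))

def dev(t):
    """The mapping of Definition 14 (written t^beta in the text)."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return Lam(t[1], dev(t[2]))
    _, h, u, l, x, v = t
    u, l, v = dev(u), tuple(dev(a) for a in l), dev(v)
    if h[0] == "lam":
        r = s(u, h[1], dev(h[2]))
        # First case (l = []) and second case (l = u1 :: l1) of the definition.
        return s(r, x, v) if not l else App(r, l[0], l[1:], x, v)
    return App(dev(h), u, l, x, v)
```

For instance, `dev` sends the encoding of $(\lambda y.y)(a, [b], (x)x)$ to that of $a(b, [], (x)x)$.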



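Proposition 6. $\Rightarrow_\beta$ has the triangle property w.r.t. $\beta$.

Proof. By induction on $\Rightarrow_\beta$. It uses parallelism of $\Rightarrow_\beta$, i.e. the fact that if $t \Rightarrow_\beta t'$
and $u \Rightarrow_\beta u'$ then $s(t, x, u) \Rightarrow_\beta s(t', x, u')$, as well as simple inversion principles
for $\Rightarrow_\beta$. □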

Lemma 13. If $t \Rightarrow_\beta t_1$ and $t \to_\pi t_2$, then $t_1 \to^*_\pi t_3$ and $t_2 \Rightarrow_\beta t_3$, for some
$t_3 \in \Lambda J^m$.

1 A relation $\to$ has the triangle property w.r.t. a function $f$ if $a \to b$ implies $b \to f(a)$.

Proof. Proved together with the fact that if $l \Rightarrow_\beta l_1$ and $l \to_\pi l_2$, then there exists
$l_3 \in \mathcal{L}J^m$ such that $l_1 \to^*_\pi l_3$ and $l_2 \Rightarrow_\beta l_3$, for all $l, l_1, l_2 \in \mathcal{L}J^m$, by simultaneous
induction on $t \Rightarrow_\beta t_1$ and $l \Rightarrow_\beta l_1$. This proof uses parallelism of $\to^*_\pi$. □

Corollary 3. $\Rightarrow_\beta$ and $\to^*_\pi$ commute.

Proof. Follows from the previous lemma by a simple diagram chase. □

Proposition 7. $\Rightarrow_\beta \to^*_\pi$ has the triangle property w.r.t. $\pi \circ \beta$.

Proof. Follows from the triangle properties of $\to^*_\pi$ and $\Rightarrow_\beta$ w.r.t. $\pi$ and $\beta$,
together with commutativity between the two relations. □



Theorem 4. $\to_\pi$, $\to_\beta$ and $\to_{\beta,\pi}$ are confluent.

Proof. Confluence of a relation can be obtained from a triangle property, as
shown in Lemma 1 of [4]. (Confluence of $\to_\pi$ can also be obtained immediately
from lemmas 11 and 12.) As to confluence of $\to_{\beta,\pi}$, observe that $\Rightarrow_\beta \to^*_\pi$
is confluent and that the reflexive and transitive closure of $\Rightarrow_\beta \to^*_\pi$
is equal to $\to^*_{\beta,\pi}$, since $\to_{\beta,\pi} \subseteq \Rightarrow_\beta \to^*_\pi \subseteq \to^*_{\beta,\pi}$. □

Now we consider confluence in the presence of rule $\mu$. The method used before
still works when one adjoins rule $\mu$, because: (i) $\to^*_{\pi,\mu}$ has a triangle property
(w.r.t. $\mu \circ \pi$); and (ii) $\to^*_{\pi,\mu}$ commutes with $\Rightarrow_\beta$. However, in the presence of rule
$\mu$, one can lift confluence results of $\lambda J$.



Theorem 5. $\to_{\beta,\pi,\mu}$, $\to_{\beta,\mu}$ and $\to_{\pi,\mu}$ are confluent.

Proof. Let $R$ be relation $\beta$ (resp. $\pi$ or $\beta \cup \pi$) and let $R'$ be $\beta$ (resp. $\pi'$ or $\beta \cup \pi'$).
Assume $t \to^*_{R,\mu} t_1$ and $t \to^*_{R,\mu} t_2$. Then, by lemmas 2, 5 and 6 it follows that
$\nu(t) \to^*_R \nu(t_1)$ and $\nu(t) \to^*_R \nu(t_2)$. Now confluence of $R$ in $\lambda J$ guarantees the
existence of $t_3$ such that $\nu(t_1) \to^*_R t_3$ and $\nu(t_2) \to^*_R t_3$. So, using Lemma 8 and
Corollary 2, $\nu(t_1) \to^*_{R'} \pi(t_3)$ and $\nu(t_2) \to^*_{R'} \pi(t_3)$, which in turn, by lemmas 5
and 7, implies $\mu(\nu(t_1)) \to^*_{R,\mu} \mu(\pi(t_3))$ and $\mu(\nu(t_2)) \to^*_{R,\mu} \mu(\pi(t_3))$. Then, from
Corollary 1, it follows $t_1 \to^*_\mu \mu(\nu(t_1))$ and $t_2 \to^*_\mu \mu(\nu(t_2))$ and thus $t_1$ and $t_2$ have
$\mu(\pi(t_3))$ as common reduct. □



4.2 Strong Normalisation

Theorem 6. There is no infinite $\to_{\beta,\pi,\mu}$-reduction sequence starting at a typable
term of $\lambda J^m$.

Proof. Suppose there is such an infinite reduction sequence $S$. It cannot contain
infinitely many $\beta,\pi$-steps. Otherwise, since (i) $\mu$-reduction is invariant under $\nu$
(Lemma 2), (ii) each $\beta,\pi$-step in $\lambda J^m$ originates under $\nu$ one or more $\beta,\pi$-steps
in $\lambda J$ (lemmas 5 and 6) and (iii) $\nu$ preserves typability, one could build in $\lambda J$ an
infinite sequence of $\beta,\pi$-steps starting at a typable term, contradicting strong
normalisation of $\lambda J$. Therefore beyond a certain point in sequence $S$ there are
solely $\mu$-steps, necessarily in infinite number, which is also impossible due to
strong normalisation of $\to_\mu$ (Proposition 1). □

Theorem 7. There is no infinite $\to_{\pi,\mu}$-reduction sequence in $\lambda J^m$.

Proof. Similar to the one above showing strong normalisation of $\to_{\beta,\pi,\mu}$.
Additionally, one just needs to observe that $\to_\pi$ in $\lambda J$ is strongly normalising. □

5 Conclusion

This work shows that the reduction relations of $\lambda J^m$ enjoy strong normalisation
of typable terms and confluence. As such $\lambda J^m$ is a well-behaved extension of
the $\lambda$-calculus and we intend to explore its potential in functional programming.
On the other hand, as shown in [2], $\lambda J^m$ captures as subsystems, not only the
system $\Lambda J$ of Joachimski and Matthes, but also the multiary $\lambda$-calculus $\lambda Ph$
[1], as well as a notational variant of the $\lambda$-calculus. So, we consider $\lambda J^m$ a useful
tool for the computational interpretation of successively stronger fragments of
sequent calculus, deserving further study in this direction.

Our investigations of the relationship between generality and multiarity
identify two isomorphic subsystems of $\lambda J^m$: (i) a variant of $\lambda J$, which is the
subsystem with minimal use of multiarity (i.e. no use); (ii) the subsystem of $\mu$-normal
forms, which is the subsystem with maximal use of multiarity (i.e. uses cons
for expressing generality whenever possible). Think of $t \in \Lambda J$ and of all its
$\mu$-reduction sequences, leading to $\mu(t)$. In a sense, all the terms involved in these
reduction sequences are representations of the same term, ranging from the term
$t$ with minimal use of multiarity to the term $\mu(t)$ with maximal use of multiarity,
going through intermediate terms that do not belong to the subsystems: $t$ and
$\mu(t)$ are canonical representations whereas the intermediate terms are a
redundancy allowed in $\lambda J^m$. Thus the two isomorphic subsystems are non-redundant
opposite extremes w.r.t. the use of multiarity.

However both subsystems have shortcomings because of this extreme
nature. In the former, multiarity is not available as a shorthand. In the latter,
it is a simple definition of expressions and reduction that is not available,
because unconstrained gm-application, as well as $\beta$- and $\pi$-reduction, can create
$\mu$-redexes, i.e. do not preserve maximal multiarity. Although exhibiting some
redundancy, $\lambda J^m$ does not suffer from the drawbacks of these subsystems.
Therefore it seems to be the system with the right use of multiarity.

Acknowledgment. The diagram in Subsection 3.2 was produced with Paul Taylor’s
macros.
Abstract. We set out to study the consequences of the assumption of
types of wellfounded trees in dependent type theories. We do so by in- 
vestigating the categorical notion of wellfounded tree introduced in [16]. 
Our main result shows that wellfounded trees allow us to define initial 
algebras for a wide class of endofunctors on locally cartesian closed cat- 
egories. 



1 Introduction 

Types of wellfounded trees, or W-types, are one of the most important
components of Martin-Löf’s dependent type theories. First, they allow us to define
a wide class of inductive types [5,15]. Secondly, they play an essential role in 
the interpretation of constructive set theories in dependent type theories [3]. 
Finally, from the proof-theoretic point of view, they represent the paradigmatic 
example of a generalised inductive definition and contribute considerably to the 
proof-theoretic strength of dependent type theories [8].

In [16] a categorical counterpart of the notion of W-type was introduced. In 
a locally cartesian closed category, W-types are defined as the initial algebras for 
endofunctors of a special kind, to which we shall refer here as polynomial functors. 
The purpose of this paper is to study polynomial endofunctors and W-types 
more closely. In particular, we set out to explore some of the consequences of the 
assumption that a locally cartesian closed category has W-types, i.e. that every 
polynomial endofunctor has an initial algebra. To explore these consequences we 
introduce dependent polynomial functors, that generalize polynomial functors. 

Our main theorem then shows that the assumption of W-types is sufficient 
to define explicitly initial algebras for dependent polynomial functors. We ex- 
pect this result to lead to further insight into the interplay between dependent 
type theory and the theory of inductive definitions. In this paper, we will limit 
ourselves to giving only two applications of our main theorem. First, we show 
how the class of polynomial functors is closed under fixpoints. We hasten to 
point out that related results appeared in [1,2]. One of our original goals was 
indeed to put those results in a more general context and simplify their proofs. 
Secondly, we show how polynomial functors have free monads, and these free
monads are themselves polynomial. The combination of these two facts leads
to further observations concerning the categories of algebras of polynomial
endofunctors. These results are relevant for our ongoing research on 2-categorical
models of the differential $\lambda$-calculus [6].

The interplay between dependent type theories and categories is here ex- 
ploited twice. On the one hand, category theory provides a mathematically effi- 
cient setting to present results that apply not only to the categories arising from 
the syntax of dependent type theories, but also to the categories providing their 
models. On the other hand, dependent type theories provide a convenient lan- 
guage to manipulate and describe the objects and the arrows of locally cartesian 
closed categories via the internal language of such a category [18]. 

In order to set up the internal language for a locally cartesian closed category 
with W-types, it is necessary to establish some technical results that ensure a 
correct interaction between the structural rules of the internal language and the 
rules for W-types. Although these results are already contained in [16] we give 
new and simpler proofs of some of them. Once this is achieved, we can freely 
exploit the internal language to prove the consequences of the assumption of 
W-types in a category. 

2 Polynomial Functors 

2.1 Locally Cartesian Closed Categories 

We say that a category $\mathcal{C}$ is a locally cartesian closed category, or a lccc for short,
if for every object $I$ of $\mathcal{C}$ the slice category $\mathcal{C}/I$ is cartesian closed$^1$. Note that
if $\mathcal{C}$ is a lccc then so are all its slices. For an arrow $f : B \to A$ in a lccc $\mathcal{C}$
we write $\Delta_f : \mathcal{C}/A \to \mathcal{C}/B$ for the associated pullback functor, which can be
defined since slice categories have cartesian products. The key fact about locally
cartesian closed categories is the following proposition [7].

Proposition 1. Let $\mathcal{C}$ be a lccc. For any arrow $f : B \to A$ in $\mathcal{C}$, the pullback
functor $\Delta_f : \mathcal{C}/A \to \mathcal{C}/B$ has both a left and a right adjoint.

Given an arrow $f : B \to A$ in a lccc $\mathcal{C}$, we will write $\Sigma_f : \mathcal{C}/B \to \mathcal{C}/A$
and $\Pi_f : \mathcal{C}/B \to \mathcal{C}/A$ for the left and right adjoint to the pullback functor,
respectively. We indicate the existing adjunctions as $\Sigma_f \dashv \Delta_f \dashv \Pi_f$.

An abuse of language. For an arrow $f : B \to A$, we write the image of $X \to A$
in $\mathcal{C}/A$ under $\Delta_f$ as $\Delta_f(X) \to B$. These arrows fit into the pullback diagram

$$\begin{array}{ccc} \Delta_f X & \longrightarrow & X \\ \downarrow & & \downarrow \\ B & \xrightarrow{\ f\ } & A \end{array}$$

1 Here and in the following, when we require the existence of some structure in a 
category, we always mean that this structure is given to us by an explicitly defined 
operation. 
The Beck-Chevalley condition. The Beck-Chevalley condition, which holds in
any lccc, expresses categorically that substitution behaves correctly with respect
to type-formation rules. More precisely, it asserts that for a pullback diagram of
the form

$$\begin{array}{ccc} D & \xrightarrow{\ k\ } & B \\ {\scriptstyle g}\downarrow & & \downarrow{\scriptstyle f} \\ C & \xrightarrow{\ h\ } & A \end{array}$$

the canonical natural transformations $\Sigma_g \Delta_k \Rightarrow \Delta_h \Sigma_f$ and $\Delta_h \Pi_f \Rightarrow \Pi_g \Delta_k$
are isomorphisms.

The axiom of choice. The type-theoretic axiom of choice [15] is expressed by
the fact that, for two arrows $g : C \to B$ and $f : B \to A$, the canonical natural
transformation $\Sigma_h \Pi_p \Delta_{\varepsilon_C} \Rightarrow \Pi_f \Sigma_g$ is an isomorphism, where

$$\begin{array}{ccc} \Delta_f \Pi_f C & \xrightarrow{\ p\ } & \Pi_f C \\ \downarrow & & \downarrow{\scriptstyle h} \\ B & \xrightarrow{\ f\ } & A \end{array}$$

is a pullback diagram and $\varepsilon_C : \Delta_f \Pi_f C \to C$ is a component of the counit of
the adjunction $\Delta_f \dashv \Pi_f$.



2.2 Internal Language 

Associated to a lccc C there is a dependent type theory Th(C) to which we shall 
refer as the internal language of C . A complete presentation of such a dependent 
type theory can be found in [9,14,18]. We limit ourselves to recalling only those 
aspects that are most relevant for the remainder of this work. The standard 
judgement forms, written here as 

$(B_a \mid a \in A)$, $(B_a = B'_a \mid a \in A)$, $(b_a \in B_a \mid a \in A)$, $(b_a = b'_a \in B_a \mid a \in A)$,

are assumed to have their usual meaning [15]. The dependent type theory Th(C) 
has the following primitive forms of type: 

$1$,  $\mathrm{Id}_A(a, a')$,  $\sum_{a \in A} B_a$,  $\prod_{a \in A} B_a$.

We refer to these as the unit, identity, dependent sum and dependent product 
types, respectively. As usual, these primitive forms of type allow us to define the 
forms of type $A \times B$ and $B^A$, to which we refer as the product and function
types. The dependent type theory Th(C) has a straightforward interpretation 
in C and thus provides a convenient language to define objects and arrows in C . 
2.3 Polynomial Functors

Let $\mathcal{C}$ be a lccc. For an object $I$ of $\mathcal{C}$, we write $I$ also for the unique arrow
$I \to 1$ into the terminal object $1$ of $\mathcal{C}$. Observe this arrow determines functors
$\Delta_I : \mathcal{C} \to \mathcal{C}/I$ and $\Sigma_I : \mathcal{C}/I \to \mathcal{C}$. We are now ready to introduce polynomial
functors. For an arrow $f : B \to A$ in $\mathcal{C}$, we define a functor $\mathcal{P}_f : \mathcal{C} \to \mathcal{C}$, called
the generalized polynomial functor associated to $f$, as the composite

$$\mathcal{C} \xrightarrow{\ \Delta_B\ } \mathcal{C}/B \xrightarrow{\ \Pi_f\ } \mathcal{C}/A \xrightarrow{\ \Sigma_A\ } \mathcal{C} \qquad (1)$$

Definition 2. We say that $P : \mathcal{C} \to \mathcal{C}$ is a generalized polynomial functor if it
is naturally isomorphic to a functor $\mathcal{P}_f : \mathcal{C} \to \mathcal{C}$ defined as $\mathcal{P}_f =_{df} \Sigma_A \Pi_f \Delta_B$,
for some arrow $f : B \to A$ of $\mathcal{C}$.

Note. To avoid clashes with the existing terminology, we adopted the name
generalised polynomial functors. This is in analogy with the distinction between
generalised inductive definitions and ordinary ones. Since in this paper we only
consider polynomial functors in the generalised sense, we refer to them simply
as polynomial functors.

Let us look more closely at the definition of polynomial endofunctors. For an
arrow $f : B \to A$, we have the two functors $\Sigma_A : \mathcal{C}/A \to \mathcal{C}$ and $\Delta_B : \mathcal{C} \to \mathcal{C}/B$.
The functor $\Delta_B$ takes an object $X$ of $\mathcal{C}$ to the left-hand side of the pullback
diagram:

$$\begin{array}{ccc} X \times B & \longrightarrow & X \\ \downarrow & & \downarrow \\ B & \longrightarrow & 1 \end{array}$$

We can therefore write $\Delta_B X = X \times B$. The action of $\Sigma_A$ is very simple: given
an object $Y \to A$ of $\mathcal{C}/A$ we have $\Sigma_A(Y \to A) = Y$. These observations lead
to a description of polynomial functors in the internal language, which we shall
exploit. The object $f : B \to A$ of $\mathcal{C}/A$ determines the judgement $(B_a \mid a \in A)$
of $Th(\mathcal{C})$. We can then explicitly define in $Th(\mathcal{C})$

$$\mathcal{P}_f(X) =_{df} \sum_{a \in A} X^{B_a},$$

for a type $X$. The interpretation in $\mathcal{C}$ of the right-hand side of the definition is
indeed $\mathcal{P}_f(X)$, as defined in (1).
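To make the formula $\sum_{a \in A} X^{B_a}$ concrete, the sketch below computes it in the category of finite sets, which is one particular lccc chosen here for illustration. The representation of $f$ as a dictionary and the names `fibre`, `poly` and `poly_map` are assumptions of this example. An element of $\mathcal{P}_f(X)$ is a pair $(a, g)$ with $g$ a function from the fibre $B_a$ to $X$, stored as a tuple of (argument, value) pairs.

```python
from itertools import product

def fibre(f, a):
    """B_a: the elements of B that f sends to a (f is a dict from B to A)."""
    return sorted(b for b, fa in f.items() if fa == a)

def poly(f, A, X):
    """Elements of P_f(X): pairs (a, g) with a in A and g a function B_a -> X."""
    elems = []
    for a in A:
        Ba = fibre(f, a)
        # One function B_a -> X for each choice of a value per element of B_a.
        for values in product(X, repeat=len(Ba)):
            elems.append((a, tuple(zip(Ba, values))))
    return elems

def poly_map(h, elem):
    """Action of P_f on an arrow h : X -> Y, by post-composition with h."""
    a, g = elem
    return (a, tuple((b, h(x)) for b, x in g))
```

With $A = \{\mathit{leaf}, \mathit{node}\}$ and $B = \{l, r\}$ both sent to $\mathit{node}$, the functor is $X \mapsto 1 + X^2$, so a three-element $X$ gives ten elements.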



2.4 Basic Properties of Polynomial Functors 

Proposition 3. The composition of two polynomial functors is polynomial. 
Proof. A proof using the internal language is in [2], but it is also possible to give
a diagrammatic one. In either case one uses crucially the axiom of choice. □

We now assume that the lccc C has finite disjoint coproducts. As pullback 
functors are left adjoints, these finite coproducts are preserved under pullbacks 
and C has stable disjoint coproducts. They can be represented in a familiar way 
in the internal language Th(C), which now has also the primitive forms of type 
0 and A + B called the empty and disjoint sum types, respectively. 

The class of polynomial functors is closed under a further operation, which
will be very important in the following. To discuss it, let us introduce a family
of functors $P_X : \mathcal{C} \to \mathcal{C}$, for $X$ in $\mathcal{C}$, associated to a functor $P : \mathcal{C} \to \mathcal{C}$. First of
all, observe that the function mapping $(X, Y)$ into $X + PY$ can be extended to
a bifunctor $\mathcal{C} \times \mathcal{C} \to \mathcal{C}$. This determines a functor $\mathcal{C} \to End(\mathcal{C})$ mapping $X$ into
$P_X$, whose action on $Y$ in $\mathcal{C}$ is defined by letting $P_X(Y) = X + PY$.

Proposition 4. Let $P : \mathcal{C} \to \mathcal{C}$ be a functor and $X$ be an object of $\mathcal{C}$. If $P$ is
polynomial then so is $P_X$.

Proof. We give a proof using the internal language. Let $f : B \to A$ and consider
the polynomial functor $\mathcal{P}_f$ associated to $f$. For $X$ and $Y$ in $\mathcal{C}$ we then have

$$X + \mathcal{P}_f(Y) = X + \sum_{a \in A} Y^{B_a} \cong \sum_{z \in X + A} Y^{B_z}$$

where $(B_z \mid z \in X + A)$ is defined so that the judgements $(B_{\iota_1(x)} = 0 \mid x \in X)$
and $(B_{\iota_2(a)} = B_a \mid a \in A)$ are derivable. □
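The construction in the proof of Proposition 4 can be checked on finite sets. This is a hedged sketch: `poly`, `extend` and `to_extended` are hypothetical helper names, elements of a coproduct are encoded with the tags "inl" and "inr", and finite sets stand in for the lccc.

```python
from itertools import product

def poly(fibres, Y):
    """Elements of sum over a of Y^(B_a); functions are tuples of (b, y) pairs."""
    return [(a, tuple(zip(B_a, values)))
            for a, B_a in fibres.items()
            for values in product(Y, repeat=len(B_a))]

def extend(fibres, X):
    """The family (B_z | z in X + A): empty over inl(x), B_a over inr(a)."""
    extended = {("inl", x): [] for x in X}
    extended.update({("inr", a): list(B_a) for a, B_a in fibres.items()})
    return extended

def to_extended(tagged):
    """Send an element of X + P_f(Y) to the matching element over X + A."""
    side, payload = tagged
    if side == "inl":
        return (("inl", payload), ())
    a, t = payload
    return (("inr", a), t)

fibres = {"leaf": [], "node": [0, 1]}
X, Y = ["x0", "x1"], ["p", "q", "r"]
lhs = [("inl", x) for x in X] + [("inr", e) for e in poly(fibres, Y)]
rhs = poly(extend(fibres, X), Y)
```

Mapping `to_extended` over `lhs` hits every element of `rhs` exactly once, which is the finite-set shadow of the isomorphism displayed in the proof.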

To recall the notions of strength for a functor, let us consider a monoidal 
category (C,®,I,a,l,r), where / is the unit object and a,l,r are natural iso- 
morphisms giving the associativity, left and right unit laws and satisfying the 
monoidal coherence axioms [12]. We can regard a lccc C as a monoidal category 
where cartesian product is the tensor, and the terminal object is the unit. 

Definition 5. Let $P : \mathcal{C} \to \mathcal{C}$ be a functor. By a strength for $P$ we mean a
natural transformation $\sigma$ with components $\sigma_{X,Y} : X \otimes PY \to P(X \otimes Y)$, for $X$
and $Y$ in $\mathcal{C}$, such that for all $X, Y, Z$ in $\mathcal{C}$ the following equations hold:

$$P(l_X) \circ \sigma_{I,X} = l_X, \qquad P(r_Y) \circ \sigma_{I,Y} = r_Y, \qquad \sigma_{X,Y \otimes Z} \circ (1_X \otimes \sigma_{Y,Z}) = \sigma_{X \otimes Y,Z}.$$

Proposition 6. Every polynomial functor has a strength. 

Proof. Let us use the internal language to define the arrow

$$\sigma_{X,Y} : X \times \mathcal{P}_f Y \to \mathcal{P}_f(X \times Y)$$

which gives us one of the components of the required strength $\sigma$ for a polynomial
functor $\mathcal{P}_f$. First, observe that the domain and the codomain of $\sigma_{X,Y}$ can be
described in $Th(\mathcal{C})$ as $X \times \sum_{a \in A} Y^{B_a}$ and $\sum_{a \in A} (X \times Y)^{B_a}$ respectively. We
can then define $\sigma_{X,Y}$ by letting $\sigma_{X,Y}(x, a, t) =_{df} (a, (\lambda b \in B_a)(x, t(b)))$ for
$(x, a, t) \in X \times \sum_{a \in A} Y^{B_a}$. □
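The defining clause of the strength translates directly into a small function on finite data. The sketch below assumes the same hypothetical encoding as before, where an element of $\sum_{a \in A} Y^{B_a}$ is a pair `(a, t)` and `t` is a tuple of `(b, y)` pairs; the name `strength` is illustrative only.

```python
def strength(x, element):
    """sigma_{X,Y}(x, a, t) = (a, lambda b. (x, t(b))) on the tuple encoding."""
    a, t = element
    # Pair the parameter x with the value stored at every position b of B_a.
    return (a, tuple((b, (x, y)) for b, y in t))

node = ("node", ((0, "p"), (1, "q")))   # an element with B_node = {0, 1}
leaf = ("leaf", ())                     # an element with an empty fibre
pushed = strength("x", node)
```

The shape `a` is left untouched and `x` is copied to every position, which is why no extra structure on the functor is needed to define the map.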
3 Change of Base

In the following, we shall be interested in the effect that pullback functors have 
on algebras for polynomial endofunctors. Let us first recall some basic definitions. 
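Let $P : \mathcal{C} \to \mathcal{C}$ be an endofunctor on a category $\mathcal{C}$. An algebra for $P$, or a $P$-algebra,
is a diagram of the form $x : PX \to X$ in $\mathcal{C}$. An arrow of $P$-algebras
from $x : PX \to X$ to $y : PY \to Y$ is given by a commuting diagram of form

$$\begin{array}{ccc} PX & \xrightarrow{\;Pf\;} & PY \\ {\scriptstyle x}\downarrow & & \downarrow{\scriptstyle y} \\ X & \xrightarrow{\;f\;} & Y \end{array}$$

There is then a manifest category $P$-alg of $P$-algebras and $P$-algebra arrows. We
write $U : P\text{-alg} \to \mathcal{C}$ for the obvious forgetful functor.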

In the following, we will work in a lccc $\mathcal{C}$. For an arrow $u : I \to J$ in $\mathcal{C}$ we
will show that the algebras for the polynomial functor $\mathcal{P}_f$ on $\mathcal{C}/J$ associated to
an arrow $f$ of $\mathcal{C}/J$ can be mapped functorially into algebras for the polynomial
endofunctor $\mathcal{P}_{\Delta_u(f)}$ on $\mathcal{C}/I$ associated to the arrow $\Delta_u(f)$ of $\mathcal{C}/I$. As we will
see, this is a purely formal consequence of some observations concerning the
2-category of polynomial functors, which we define below. The treatment is inspired
by the formal theory of monads [10,19].

3.1 The 2-Category of Polynomial Functors

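Let us define the 2-category Poly. An object of Poly is a pair $(\mathcal{C}, \mathcal{P}_f)$ where $\mathcal{C}$
is a lccc and $\mathcal{P}_f$ is the polynomial endofunctor on $\mathcal{C}$ associated to an arrow
$f : B \to A$ in $\mathcal{C}$. A 1-cell with domain $(\mathcal{C}, \mathcal{P}_f)$ and codomain $(\mathcal{D}, \mathcal{P}_g)$ is given by
a pair $(F, \phi)$ where $F : \mathcal{C} \to \mathcal{D}$ is a functor and $\phi : \mathcal{P}_g F \Rightarrow F \mathcal{P}_f$ is a natural
transformation, usually drawn in a diagram of form

$$\begin{array}{ccc} \mathcal{C} & \xrightarrow{\;F\;} & \mathcal{D} \\ {\scriptstyle \mathcal{P}_f}\downarrow & {\scriptstyle \phi} & \downarrow{\scriptstyle \mathcal{P}_g} \\ \mathcal{C} & \xrightarrow{\;F\;} & \mathcal{D} \end{array}$$

The 2-cells of Poly are defined exactly as the ones in 2-categories of monads [19].
We can now define a 2-functor $\mathrm{Alg} : \mathrm{Poly} \to \mathrm{Cat}$, but for the purposes of this
paper, it is sufficient to give the definition of its action on objects and 1-cells.
For an object $(\mathcal{C}, \mathcal{P}_f)$ of Poly we define

$$\mathrm{Alg}(\mathcal{C}, \mathcal{P}_f) =_{df} \mathcal{P}_f\text{-alg}.$$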
Given a 1-cell $(F, \phi) : (\mathcal{C}, \mathcal{P}_f) \to (\mathcal{D}, \mathcal{P}_g)$ in Poly, the functor $\mathrm{Alg}(F, \phi)$ is defined
by mapping a $\mathcal{P}_f$-algebra $x : \mathcal{P}_f X \to X$ into the composite

$$\mathcal{P}_g F X \xrightarrow{\;\phi_X\;} F \mathcal{P}_f X \xrightarrow{\;Fx\;} FX$$

that is a $\mathcal{P}_g$-algebra.

3.2 Pullback of Algebras 

By a locally cartesian closed functor, or lccc functor, we mean a functor that 
preserves the locally cartesian closed structure up to isomorphism. The next 
proposition is a simple but useful fact. 

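Proposition 7. Let $\mathcal{C}$ and $\mathcal{D}$ be lccc's, and let $F : \mathcal{C} \to \mathcal{D}$ be a lccc functor. For
any arrow $f : B \to A$ there is a natural isomorphism

$$\chi_f : \mathcal{P}_{Ff} F \Rightarrow F \mathcal{P}_f$$

such that the 1-cell $(F, \chi_f) : (\mathcal{C}, \mathcal{P}_f) \to (\mathcal{D}, \mathcal{P}_{Ff})$ determines a commuting diagram
of form

$$\begin{array}{ccc} \mathcal{P}_f\text{-alg} & \xrightarrow{\;U\;} & \mathcal{C} \\ {\scriptstyle \mathrm{Alg}(F, \chi_f)}\downarrow & & \downarrow{\scriptstyle F} \\ \mathcal{P}_{Ff}\text{-alg} & \xrightarrow{\;U\;} & \mathcal{D} \end{array}$$

where the horizontal arrows are the forgetful functors.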

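Proof. For an arrow $f : B \to A$ the required natural isomorphism $\chi_f$ is obtained
by pasting the three isomorphisms in the diagram

$$\begin{array}{ccccccc} \mathcal{C} & \xrightarrow{\;\Delta_B\;} & \mathcal{C}/B & \xrightarrow{\;\Pi_f\;} & \mathcal{C}/A & \xrightarrow{\;\Sigma_A\;} & \mathcal{C} \\ {\scriptstyle F}\downarrow & & \downarrow{\scriptstyle F/B} & & \downarrow{\scriptstyle F/A} & & \downarrow{\scriptstyle F} \\ \mathcal{D} & \xrightarrow{\;\Delta_{FB}\;} & \mathcal{D}/FB & \xrightarrow{\;\Pi_{Ff}\;} & \mathcal{D}/FA & \xrightarrow{\;\Sigma_{FA}\;} & \mathcal{D} \end{array}$$

where for an object $I$, we write $F/I : \mathcal{C}/I \to \mathcal{D}/FI$ for the obvious functor
induced by $F$. The isomorphisms in the diagram exist since $F$ is a lccc functor.
The rest of the proof follows by direct calculation. □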

We can apply Proposition 7 to pullback functors, as they are lccc functors [7]. 

Corollary 8. Let $\mathcal{C}$ be a lccc. Let $I$ be an object of $\mathcal{C}$. For an arrow $f : B \to A$
in $\mathcal{C}$ there is a natural isomorphism $\chi_{\Delta_I} : \mathcal{P}_g \Delta_I \Rightarrow \Delta_I \mathcal{P}_f$, where $g =_{df} \Delta_I f$.

4 Wellfounded Trees

Definition 9. We say that a lccc $\mathcal{C}$ has W-types if for every arrow $f : B \to A$
in $\mathcal{C}$ there is a diagram $\mathcal{P}_f(W_f) \to W_f$ which is an initial algebra for $\mathcal{P}_f : \mathcal{C} \to \mathcal{C}$.

Recall that, by a theorem of Lambek, the arrow $\mathcal{P}_f(W_f) \to W_f$ is an isomorphism.
Once the internal language of a lccc with W-types is set up, we will
therefore be allowed to write

$$W \cong \sum_{a \in A} W^{B_a}$$

where $f : B \to A$ and $W =_{df} W_f$. The next subsection is devoted to justifying the
use of the internal language in connection with W-types.
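The isomorphism $W \cong \sum_{a \in A} W^{B_a}$ says that an element of $W$ is a label $a$ together with a $B_a$-indexed family of subtrees, and initiality gives a unique fold into any other algebra. The sketch below illustrates this for finitely branching trees; the encoding as `(a, children)` pairs and the names `fold`, `node` and `count_leaves` are assumptions made for the example, not notation from the paper.

```python
def fold(algebra, tree):
    """Unique algebra map out of the W-type, computed by structural recursion.

    `tree` is a pair (a, children) where `children` maps each b in B_a to a subtree.
    `algebra(a, results)` plays the role of a structure map P_f(X) -> X.
    """
    a, children = tree
    return algebra(a, {b: fold(algebra, child) for b, child in children.items()})

leaf = ("leaf", {})                      # label with empty fibre

def node(left, right):
    """Label with fibre {0, 1}: a binary branching point."""
    return ("node", {0: left, 1: right})

def count_leaves(a, results):
    """An algebra on the natural numbers."""
    return 1 if a == "leaf" else results[0] + results[1]

tree = node(node(leaf, leaf), leaf)
leaves = fold(count_leaves, tree)
```

Any other algebra, for instance one computing the depth of the tree, can be passed to the same `fold`, which is the computational content of initiality in this finite setting.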



4.1 Pullback of Wellfounded Trees 

In [16] it is proved that if C has W-types then so do all its slices. A proof of 
this fact can be obtained by defining explicitly initial algebras for polynomial 
endofunctors on the slice categories. It is also observed there that the pullback 
functors preserve W-types. Although in [16] it is suggested to prove this second 
fact using the explicit definition of W-types in slice categories, we give a new 
and more direct proof of this fact. 

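Let $\mathcal{C}$ be a lccc and let $I$ be an object in $\mathcal{C}$. Recall from Corollary 8 that there is
a natural isomorphism $\chi_{\Delta_I} : \mathcal{P}_g \Delta_I \Rightarrow \Delta_I \mathcal{P}_f$ where $g =_{df} \Delta_I(f)$. This natural
transformation determines a functor $F_I : \mathcal{P}_f\text{-alg} \to \mathcal{P}_g\text{-alg}$ defined as $F_I =_{df} \mathrm{Alg}(\Delta_I, \chi_{\Delta_I})$.
We now use the inverse to $\chi_{\Delta_I}$, given by a natural transformation
$\psi : \Delta_I \mathcal{P}_f \Rightarrow \mathcal{P}_g \Delta_I$, to define a functor $G_I : \mathcal{P}_g\text{-alg} \to \mathcal{P}_f\text{-alg}$ that is right
adjoint to $F_I$. First of all, observe that $\psi$ gives us a natural transformation
$\xi : \mathcal{P}_f \Pi_I \Rightarrow \Pi_I \mathcal{P}_g$ that is defined as the composite

$$\mathcal{P}_f \Pi_I \xrightarrow{\;\eta \mathcal{P}_f \Pi_I\;} \Pi_I \Delta_I \mathcal{P}_f \Pi_I \xrightarrow{\;\Pi_I \psi \Pi_I\;} \Pi_I \mathcal{P}_g \Delta_I \Pi_I \xrightarrow{\;\Pi_I \mathcal{P}_g \varepsilon\;} \Pi_I \mathcal{P}_g$$

where $\eta$ and $\varepsilon$ are the unit and the counit of the adjunction $\Delta_I \dashv \Pi_I$, respectively.
Hence we have that $(\Pi_I, \xi) : (\mathcal{C}/I, \mathcal{P}_g) \to (\mathcal{C}, \mathcal{P}_f)$ is a 1-cell in Poly and
thus we can simply define $G_I =_{df} \mathrm{Alg}(\Pi_I, \xi)$.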

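Theorem 10. Let $\mathcal{C}$ be a lccc and let $f : B \to A$ be an arrow in $\mathcal{C}$. For any
object $I$ of $\mathcal{C}$ the adjunction $\Delta_I \dashv \Pi_I$ lifts to an adjunction $F_I \dashv G_I$, i.e. in the
diagram

$$\begin{array}{ccc} \mathcal{P}_g\text{-alg} & \xrightarrow{\;U\;} & \mathcal{C}/I \\ {\scriptstyle F_I}\uparrow \downarrow{\scriptstyle G_I} & & {\scriptstyle \Delta_I}\uparrow \downarrow{\scriptstyle \Pi_I} \\ \mathcal{P}_f\text{-alg} & \xrightarrow{\;U\;} & \mathcal{C} \end{array}$$

where $g =_{df} \Delta_I(f)$, the inner and outer squares commute.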
The functor $G_I$ can be described also in the internal language. Let us consider
a $\mathcal{P}_g$-algebra, i.e. an arrow $z : \mathcal{P}_g Z \to Z$ in $\mathcal{C}/I$. This arrow determines the
judgement

$$\Big( z(i, (a, s)) \in Z_i \;\Big|\; i \in I,\ (a, s) \in \sum_{a \in A} Z_i^{B_a} \Big)$$

where $(Z_i \mid i \in I)$ is the judgement associated to the object $Z \to I$ of $\mathcal{C}/I$. We
can then derive the judgement

$$\Big( (\lambda i \in I)\, z(i, (a, (\lambda b \in B_a)\, t(b, i))) \in \prod_{i \in I} Z_i \;\Big|\; (a, t) \in \sum_{a \in A} \Big( \prod_{i \in I} Z_i \Big)^{B_a} \Big)$$

which gives us a $\mathcal{P}_f$-algebra $\mathcal{P}_f \Pi_I Z \to \Pi_I Z$. This is exactly the image under $G_I$
of the $\mathcal{P}_g$-algebra $\mathcal{P}_g Z \to Z$. A proof of Theorem 10 can then be obtained either
by reasoning with diagrams or with the internal language. We can now derive a
simple proof of the pullback stability for W-types.

Corollary 11 (Pullbacks preserve W-types). Let $\mathcal{C}$ be a lccc. Let $u : I \to J$
be an arrow in $\mathcal{C}$. For objects $B \to J$ and $A \to J$ in $\mathcal{C}/J$ and an arrow $f : B \to A$
between them, there is an isomorphism $W_{\Delta_u(f)} \cong \Delta_u W_f$.

Proof. Note that without loss of generality we can assume that $J$ is the terminal
object of $\mathcal{C}$ and thus consider the pullback functor $\Delta_I : \mathcal{C} \to \mathcal{C}/I$ determined by
the unique arrow $I \to 1$. But now it suffices to apply Theorem 10 and observe that $F_I$, defined
as $F_I =_{df} \mathrm{Alg}(\Delta_I, \chi_{\Delta_I})$, preserves initial objects because it is a left adjoint. □



5 Dependent Polynomial Functors 

We can now reap the fruits of the work done in the last section and exploit
freely the internal language to prove further consequences of the assumption
of the existence of W-types in a lccc. Here we show how W-types can be used
to define initial algebras for a class of functors that is wider than the one of
polynomial functors. Let $\mathcal{C}$ be a lccc. For a diagram, which we do not assume to
be commuting, of form

$$B \xrightarrow{\;f\;} A \xrightarrow{\;r\;} I, \qquad s : B \to I \qquad (2)$$

we define $D_f : \mathcal{C}/I \to \mathcal{C}/I$, called the dependent polynomial endofunctor associated
to the diagram, as the composite

$$\mathcal{C}/I \xrightarrow{\;\Delta_s\;} \mathcal{C}/B \xrightarrow{\;\Pi_f\;} \mathcal{C}/A \xrightarrow{\;\Sigma_r\;} \mathcal{C}/I$$
We can describe the action of $D_f$ on an object $(X_i \mid i \in I)$ of $\mathcal{C}/I$ by letting

$$D_f(X_i \mid i \in I) =_{df} \Big( \sum_{a \in A_i} \prod_{b \in B_a} X_{sb} \;\Big|\; i \in I \Big) \qquad (3)$$

Using this description, we can observe that
initial algebras for dependent polynomial functors are categorical counterparts
of the so-called general trees of Martin-Löf type theory, as described in [17,
Chapter 16]. We now give some examples of dependent polynomial functors.

(i) Polynomial functors on slice categories are special examples of dependent
polynomial functors. Observe that, if the diagram in (2) commutes, then
the formula in (3) simplifies to

$$D_f(X_i \mid i \in I) = \Big( \sum_{a \in A_i} X_i^{B_a} \;\Big|\; i \in I \Big).$$

(ii) Let $f : B \to A$ be an arrow in $\mathcal{C}$ and define $W =_{df} W_f$. For our applications,
it is useful to observe that, for an arrow $g : C \to B$, the endofunctor
$F : \mathcal{C}/W \to \mathcal{C}/W$ defined in the internal language by letting

$$F(X_{(a,t)} \mid (a,t) \in W) =_{df} \Big( \sum_{b \in B_a} X_{t(b)}^{C_b} \;\Big|\; (a,t) \in W \Big)$$

is a dependent polynomial functor. Indeed, it is naturally isomorphic to
the functor associated to the diagram

$$\sum_{(a,t) \in W} \sum_{b \in B_a} C_b \xrightarrow{\;p\;} \sum_{(a,t) \in W} B_a \xrightarrow{\;r\;} W, \qquad s : \sum_{(a,t) \in W} \sum_{b \in B_a} C_b \to W$$

where $p(a,t,b,c) =_{df} (a,t,b)$, $s(a,t,b,c) =_{df} t(b)$ and $r(a,t,b) =_{df} (a,t)$
for $(a,t) \in W$, $b \in B_a$, $c \in C_b$.

(iii) If $\mathcal{C}$ has finite disjoint coproducts, the coproduct of two dependent polynomial
functors is still a dependent polynomial functor. For two dependent
polynomial functors $D_1$, $D_2$ associated respectively to the two diagrams

$$B_1 \xrightarrow{\;f_1\;} A_1 \qquad \text{and} \qquad B_2 \xrightarrow{\;f_2\;} A_2$$

(each with its arrows to $I$ as in (2)), the functor $D_1 + D_2$ is naturally isomorphic
to the dependent polynomial functor associated to the diagram

$$B_1 + B_2 \xrightarrow{\;f_1 + f_2\;} A_1 + A_2$$

(with the induced arrows to $I$).
5.1 Initial Algebras

We want to prove our first main result. We assume that the lccc C has W-types. 

Theorem 12. Every dependent polynomial functor has an initial algebra. 

The proof involves a generalisation of the argument showing that W-types exist
in slice categories [16, Proposition 3.8], a result that indeed follows as a corollary
of our theorem. We begin by constructing a candidate to be the initial algebra
for the dependent polynomial functor defined in (3). Let us consider the W-types
$W_f$ and $W_{f \times I}$ associated to $f : B \to A$ and $f \times 1_I : B \times I \to A \times I$. The canonical
isomorphisms

$$W_f \cong \sum_{a \in A} W_f^{B_a}, \qquad W_{f \times I} \cong I \times \sum_{a \in A} W_{f \times I}^{B_a}$$

will be treated here as equalities to simplify the presentation. Let us recall that
there is an arrow $\rho : W_f \to A$ defined by letting $\rho(a,t) =_{df} a$ for $(a,t) \in W_f$.

The strategy to define the candidate $V \to I$ to be an initial algebra will be
as follows. First, we will define $V$ as the object fitting in the equalizer diagram

$$V \xrightarrow{\;\eta\;} W_f \underset{\xi'}{\overset{\xi}{\rightrightarrows}} W_{f \times I} \qquad (4)$$

determined by appropriate arrows $\xi$ and $\xi'$. Secondly, the required object of $\mathcal{C}/I$
can then be defined as $r \rho \eta : V \to I$. It now remains to define the arrows $\xi$, $\xi'$.
The arrow $\xi$ is defined by recursion on $W_f$ by letting, for $(a,t) \in W_f$,

$$\xi(a,t) =_{df} (ra, a, (\lambda b \in B_a)\, \xi(tb)).$$

The definition of $\xi'$ is more involved. First, we define $\varphi : W_{f \times I} \times B \to W_{f \times I}$ by
recursion. For $(i,a,t,b) \in W_{f \times I} \times B$, define

$$\varphi(i,a,t,b) =_{df} (sb, a, (\lambda b' \in B_a)\, \varphi(t(b'), b')).$$

Then, we define $\psi : W_{f \times I} \to W_{f \times I}$ by letting, for $(i,a,t) \in W_{f \times I}$,

$$\psi(i,a,t) =_{df} (i, a, (\lambda b \in B_a)\, \varphi(tb, b)).$$

Finally, we fix $\xi' =_{df} \psi\, \xi$. The key property of the object $V$ that allows us to
prove Theorem 12 is stated in the next lemma.

Lemma 13. For all $(a,t) \in W_f$, we have $(a,t) \in \sum_{a \in A_i} \prod_{b \in B_a} V_{sb}$ if and only
if $(a,t) \in V_i$, where $i =_{df} ra$.

Proof. Let $(a,t) \in W_f$ and define $i =_{df} ra$. First of all one needs to show that,
for all $b \in B_a$,

$$\xi(tb) = \varphi(\xi(tb), b) \iff sb = r\rho(tb) \,\wedge\, \xi(tb) = \psi\, \xi(tb). \qquad (5)$$
Lemma 13 shows that $V \to I$ can be equipped with a structure map, and
thus gives us an algebra for the dependent polynomial functor defined in (3).
The initiality of this algebra follows from reasoning that is completely analogous 
to that in [16, Proposition 3.8] and hence is omitted here. 



5.2 Applications 

We give a first application of Theorem 12. Let us consider two arrows $f : B \to A$
and $g : D \to C$ in a lccc $\mathcal{C}$ with W-types. We can then define a bifunctor
$F : \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ whose action on an object $(X, Y)$ is defined by letting

$$F(X, Y) =_{df} \mathcal{P}_f(X) \times \mathcal{P}_g(Y).$$

For a fixed object $X$ of $\mathcal{C}$, the functor $F_X : \mathcal{C} \to \mathcal{C}$ that maps $Y$ into $F(X, Y)$
can easily be seen to be polynomial. It therefore has an initial algebra, which we
denote as

$$F_X(\mu Y.F(X, Y)) \to \mu Y.F(X, Y).$$

The assignment of $\mu Y.F(X, Y)$ to $X$ can then be extended to a functor $\mathcal{C} \to \mathcal{C}$.
We refer to these functors as fixpoint functors. We can now state our second
main result.

Theorem 14. Fixpoint functors are polynomial.

Proof. We limit ourselves to sketching the main idea of the argument. Let us
first suppose that the fixpoint functor is polynomial, and let $Q \to P$ be an arrow
in $\mathcal{C}$ such that

$$\mu Y.F(X, Y) \cong \sum_{p \in P} X^{Q_p}.$$

Direct calculations imply that there must be isomorphisms

$$P \cong A \times \sum_{c \in C} P^{D_c}, \qquad Q_{(a,c,t)} \cong B_a + \sum_{d \in D_c} Q_{t(d)}$$

for $(a,c,t) \in A \times \sum_{c \in C} P^{D_c}$. The first isomorphism certainly holds if we define $P$
as the W-type of the arrow $g \times 1_A : D \times A \to C \times A$ and use Corollary 11.
Theorem 12 shows that it is also possible to satisfy the second isomorphism by
defining $Q \to P$ as the initial algebra for an appropriate dependent polynomial
functor. Recalling the examples of dependent polynomial functors given earlier
in this section, it is immediate to observe that the functor $F : \mathcal{C}/P \to \mathcal{C}/P$
defined by letting

$$F(X_{(a,c,t)} \mid (a,c,t) \in P) =_{df} \Big( B_a + \sum_{d \in D_c} X_{t(d)} \;\Big|\; (a,c,t) \in P \Big)$$

for $(X_{(a,c,t)} \mid (a,c,t) \in P)$ in $\mathcal{C}/P$, is a dependent polynomial functor, since it
is the sum of two such functors. □



6 Free Monads 

6.1 Background 

We review some facts concerning endofunctors and monads, and some results 
concerning free monads. More details can be found in [4,11]. 

Definition 15. Let P be an endofunctor on C . We say that P has a free monad 
if the forgetful functor U : P-alg —> C has a left adjoint. 

The next proposition shows that the existence of a free monad for an endo- 
functor is a necessary and sufficient condition for its category of algebras to be 
isomorphic to a category of algebras for a monad. 

Proposition 16. The forgetful functor U : P-alg —> C has a left adjoint if and 
only if it is monadic over C. 

Proof. The proof is an application of Beck's theorem [13] characterising monadic
adjunctions. One should observe that the functor $U$ satisfies all the hypotheses
of Beck's theorem except for the existence of a left adjoint. □

When $(T, \eta, \mu)$ is a monad on a category $\mathcal{C}$ we write $T$-Alg for the usual
category of $T$-algebras. Note that we follow a suggestion of Peter Freyd in using
$P$-alg for the algebras of an endofunctor $P$ and $T$-Alg for the algebras for a
monad $T$. Again, we write $U : T\text{-Alg} \to \mathcal{C}$ for the forgetful functor. We can then
restate Proposition 16 as follows.

Proposition 17. $(T, \eta, \mu)$ is a free monad for $P$ if and only if there is an equivalence
$T\text{-Alg} \simeq P\text{-alg}$ such that the following diagram commutes

$$\begin{array}{ccc} T\text{-Alg} & \xrightarrow{\;\simeq\;} & P\text{-alg} \\ & {\scriptstyle U}\searrow \quad \swarrow{\scriptstyle U} & \\ & \mathcal{C} & \end{array}$$
We wish to give a more concrete description of the free monad for an endofunctor
on a locally cartesian closed category with coproducts. To do so, we
use the family of functors $P_X : \mathcal{C} \to \mathcal{C}$, for $X$ in $\mathcal{C}$, associated to a functor
$P : \mathcal{C} \to \mathcal{C}$ and defined in Subsection 2.4. As we did in the discussion leading to
Proposition 4, we assume that $\mathcal{C}$ has finite disjoint coproducts.

Proposition 18. Let $P$ be an endofunctor on $\mathcal{C}$. The following are equivalent:

(i) the endofunctor $P$ has a free monad;

(ii) the comma category $X \downarrow U$ has an initial object, for all $X$ in $\mathcal{C}$;

(iii) the endofunctor $P_X$ has an initial algebra, for all $X$ in $\mathcal{C}$.

Proof. The equivalence (i) $\iff$ (ii) follows by Definition 15 and by the possibility
of determining a left adjoint via initial objects in comma categories [13]. The
equivalence (ii) $\iff$ (iii) follows from the isomorphism $X \downarrow U \cong P_X\text{-alg}$. One
could also verify directly the implication (iii) $\Rightarrow$ (i) by defining explicitly the
free monad $(T, \eta, \mu)$ for $P$. The functor $T$ is defined by letting $T(X)$ be the
initial algebra for $P_X$, for $X$ in $\mathcal{C}$. □

We conclude this review by recalling the notion of strength for a monad and
a simple fact about it.

Definition 19. Let $(T, \eta, \mu)$ be a monad on $\mathcal{C}$. By a strength for $(T, \eta, \mu)$ we
mean a strength $\sigma$ for the functor $T$ such that, for all $X$ and $Y$ in $\mathcal{C}$, we have

$$\sigma_{X,Y} \circ (1_X \otimes \eta_Y) = \eta_{X \otimes Y}, \qquad \mu_{X \otimes Y} \circ T(\sigma_{X,Y}) \circ \sigma_{X,TY} = \sigma_{X,Y} \circ (1_X \otimes \mu_Y).$$

Proposition 20. Let $P$ be an endofunctor on $\mathcal{C}$ and $(T, \eta, \mu)$ be the free monad
on $P$. A strength for the functor $P$ determines a strength for the monad $(T, \eta, \mu)$.

Proof. The strength can be defined using the explicit description of the free
monad given in the proof of Proposition 18. □

6.2 Free Monads for Polynomial Functors 

We begin by ensuring the existence of free monads for polynomial functors. 

Theorem 21. If $\mathcal{C}$ is a lccc with finite disjoint coproducts and W-types, then
every polynomial endofunctor on $\mathcal{C}$ has a free monad.

Proof. Let $P : \mathcal{C} \to \mathcal{C}$ be a polynomial functor. If we knew that for every $X$
in $\mathcal{C}$ the functor $P_X : \mathcal{C} \to \mathcal{C}$ had an initial algebra, then we could invoke
Proposition 18 and conclude the desired claim. By Proposition 4, however, the
functors $P_X : \mathcal{C} \to \mathcal{C}$, for $X$ in $\mathcal{C}$, are polynomial, and therefore they have an
initial algebra by the assumption that $\mathcal{C}$ has W-types. □
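The description of $T(X)$ as the initial algebra for $P_X(Y) = X + PY$ has a familiar computational reading: an element is either a variable from $X$ or an operation label with a family of subterms, and the monad multiplication is substitution. The following sketch works on finitely branching terms; the constructors `unit` and `op`, the function `bind` and the tags "var" and "op" are hypothetical names chosen for illustration.

```python
def unit(x):
    """eta: inject a generator x of X as a variable."""
    return ("var", x)

def op(a, children):
    """A node with label a and a B_a-indexed family of subterms."""
    return ("op", a, dict(children))

def bind(term, k):
    """Substitute k(x) for every variable x; this encodes mu composed with T(k)."""
    if term[0] == "var":
        return k(term[1])
    _, a, children = term
    return op(a, {b: bind(child, k) for b, child in children.items()})

# Assumed signature: "node" has two positions, "leaf" has none.
t = op("node", {0: unit("x"), 1: op("leaf", {})})
k = lambda v: op("node", {0: unit(v + "1"), 1: unit(v + "2")})
h = lambda v: op("leaf", {})
```

On this example the two unit laws and associativity of `bind` can be checked by direct comparison of terms, which is a small sanity check rather than a proof of the monad laws.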

The next corollary, a consequence of Proposition 16 and Theorem 21, allows 
us to observe the existence of structure on the categories of algebras for poly- 
nomial functors. From now on, we assume that C is a lccc with finite disjoint 
coproducts and W-types. 
Corollary 22. For every polynomial functor $P$ on $\mathcal{C}$, the category $P$-alg is isomorphic
to the category $T$-Alg, where $T$ is the free monad on $P$.

We can also derive information on free monads for polynomial functors. 

Proposition 23. Free monads for polynomial functors have a strength. 

Proof. The claim is a consequence of Proposition 6 and Proposition 20. □ 

We conclude the paper with our third and last main result, whose proof is 
completely analogous to that of Theorem 14. 

Theorem 24. The free monad on a polynomial functor is polynomial. 
Abstract. Curien and Herbelin provided a Curry-Howard correspondence
between classical propositional logic and a computational model called $\overline{\lambda}\mu\tilde{\mu}$, which
is a calculus for interpreting classical sequents. A new terminology for $\overline{\lambda}\mu\tilde{\mu}$ in
terms of pairs of callers and callees, which we name capsules, highlights a natural
link between $\overline{\lambda}\mu\tilde{\mu}$ and process calculi. In this paper we propose an intersection
type system $\overline{\lambda}\mu\tilde{\mu}^{\cap}$, which is an extension of $\overline{\lambda}\mu\tilde{\mu}$ with intersection types. We prove
that all strongly normalizing $\overline{\lambda}\mu\tilde{\mu}$-terms are typeable in the new system, which
was not the case in $\overline{\lambda}\mu\tilde{\mu}$. Also, we prove that all typeable $\tilde{\mu}$-free terms are strongly
normalizing.



1 Introduction 

In this paper we study $\overline{\lambda}\mu\tilde{\mu}$, a type assignment system designed by Curien and Herbelin
[8,12] which gives a computational content to classical logic. It deals with interactions
and therefore it seems to be well suited as a process language. Our main concerns
are the type-free $\overline{\lambda}\mu\tilde{\mu}$-calculus, the untyped calculus underlying $\overline{\lambda}\mu\tilde{\mu}$, as well as the
intersection type system $\overline{\lambda}\mu\tilde{\mu}^{\cap}$, which is an extension of $\overline{\lambda}\mu\tilde{\mu}$ with intersection types.
The main components of type-free $\overline{\lambda}\mu\tilde{\mu}$ are capsules in which two entities interact. One,
named the caller, basically performs one of two actions: it either gets data from another entity,
named the callee, or asks the callee to be its continuation. A callee can ask the caller to take
the place of one of its specified internal caller variables. These components are nested,
with more than one process being active at a given moment. Presented this way (notice
that we changed the terminology for a more appealing one), type-free $\overline{\lambda}\mu\tilde{\mu}$ seems rather
application oriented and one may believe that it has been designed by computer scientists
or artificial intelligence researchers [18]. But this is not the case. Indeed $\overline{\lambda}\mu\tilde{\mu}$ has deep
and interesting logical properties as it is an interpretation of classical propositional logic
for which it offers a Curry-Howard correspondence, i.e., a correspondence between
propositions and types, proofs and terms, proof normalization and term reduction.

* Partially supported by grant 1630 "Representation of proofs with applications, classification of
structures and infinite combinatorics" (of the Ministry of Science, Technology, and Development
of Serbia).

S. Berardi, M. Coppo, and F. Damiani (Eds.): TYPES 2003, LNCS 3085, pp. 226-241, 2004.
(c) Springer-Verlag Berlin Heidelberg 2004
The Curry-Howard correspondence [14,20] is one of the main achievements in logic
in computer science over the last years. Originally it was introduced to show the
connection between intuitionistic logic and lambda calculus, but several attempts have
been made since to include classical logic. A natural deduction approach, the $\lambda\mu$-calculus,
was proposed by Parigot [15,16] while several sequent calculus proposals have been
studied [3,1,8,11]. We have chosen the calculus $\overline{\lambda}\mu\tilde{\mu}$ of Curien and Herbelin because it
has interesting features. In particular, its connection with cut elimination [10] is somewhat
direct and its link with process calculi and, why not, with object oriented languages
seems promising.

Both $\lambda\mu$ and $\overline{\lambda}\mu\tilde{\mu}$ are proven to be strongly normalizing [16,12]. Still, there are
strongly normalizing terms not typeable in these systems. In this paper we are interested
in computations that terminate, more precisely we study the characterization of strongly
normalizing type-free $\overline{\lambda}\mu\tilde{\mu}$-terms by intersection types. In such a system typeable terms
are exactly the strongly normalizing terms. This characterization was done first for
lambda calculus by Coppo, Dezani, Venneri, Pottinger and Sallé in [5,6,7,17,19]. No
attempt has been made so far to extend it to an interpretation of classical logic. One of the
main features of our type system is the exclusive use of introduction rules (left and
right) for the intersection. This feature is usual for connectives in sequent calculus, but
it is nice to keep it also for intersection operators. The proof of typeability of strongly
normalizing $\overline{\lambda}\mu\tilde{\mu}$-terms relies on a concept of perpetual strategy which has already been
used in similar proofs for explicit substitutions [4,9] and which seems particularly well
suited to such a system. The proof of strong normalization of typeable $\tilde{\mu}$-free $\overline{\lambda}\mu\tilde{\mu}$-terms
is based on a very simple and nice definition of reducible sets which is expected to have
other applications.

The paper is organized in the following way. Section 2 deals with the type-free
$\overline{\lambda}\mu\tilde{\mu}$-calculus, the untyped syntax underlying the $\overline{\lambda}\mu\tilde{\mu}$-calculus, with particular focus on the
newly defined perpetual strategies. In Section 3 we define an intersection type system
$\overline{\lambda}\mu\tilde{\mu}^{\cap}$, which is an extension of the type system $\overline{\lambda}\mu\tilde{\mu}$ of Curien and Herbelin. In Section
4 we prove by means of perpetual strategies that all strongly normalizing $\overline{\lambda}\mu\tilde{\mu}$-terms are
typeable in the new system $\overline{\lambda}\mu\tilde{\mu}^{\cap}$. In Section 5 we prove the strong normalization of
$\tilde{\mu}$-free terms typeable in $\overline{\lambda}\mu\tilde{\mu}^{\cap}$ employing the reducibility method.



2 Sequent-Style Untyped $\overline{\lambda}\mu\tilde{\mu}$-Calculus

2.1 Untyped Syntax 

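We consider the type-free $\overline{\lambda}\mu\tilde{\mu}$-calculus, the sequent style formulation of the untyped
calculus underlying the $\overline{\lambda}\mu\tilde{\mu}$-calculus proposed by Curien and Herbelin in [8,12]. We focus
on the so-called $\overline{\lambda}\mu\tilde{\mu}$-calculus, which is one of the $\overline{\lambda}\mu\tilde{\mu}$-calculi introduced in [8].

Type-free $\overline{\lambda}\mu\tilde{\mu}$ has three syntactic categories (features), which we call CalleR,
CalleE, and Capsules:

$$\begin{array}{lrcl} \textit{CalleR} & r & ::= & x \mid \lambda x.r \mid \mu\alpha.c \\ \textit{CalleE} & e & ::= & \alpha \mid r \bullet e \mid \tilde{\mu}x.c \\ \textit{Capsules} & c & ::= & \langle r \,\|\, e \rangle \end{array}$$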
The elements of CalleR, CalleE and Capsules are together referred to as $\overline{\lambda}\mu\tilde{\mu}$-terms.
There are two kinds of variables in this system: the set $Var_r$ is made of Latin
variables $x, y, \ldots$ which represent inputs, in particular they are bound by $\lambda$-abstractions
or $\tilde{\mu}$-abstractions, and the set $Var_e$ is made of Greek variables $\alpha, \beta, \ldots$ which represent
continuations and which can be bound by $\mu$-abstractions. In a process interpretation,
variables can be seen as communication channels, respectively input channels ($Var_r$)
and output channels ($Var_e$). The core of type-free $\overline{\lambda}\mu\tilde{\mu}$ is made of capsules $\langle \textit{caller} \,\|\, \textit{callee} \rangle$
where caller and callee are two components supposed to communicate through
a private channel. If the caller is of the form $\lambda x.r$ this means that its channel is active
and waits for a value from the callee, which is supposed to be ready to send its first item.
If the caller is of the form $\mu\alpha.c$ this means that $c$ expects the callee to be its continuation.
If the callee is of the form $\tilde{\mu}x.c$ this means that $c$ will take the caller to fill its hole named $x$.


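Type-free $\overline{\lambda}\mu\tilde{\mu}$ has three reductions (evaluation rules) which make the previous
interpretation more precise.

$$\begin{array}{lrcl} (\lambda) & \langle \lambda x.r \,\|\, r' \bullet e \rangle & \to & \langle r[x \leftarrow r'] \,\|\, e \rangle \\ (\mu) & \langle \mu\alpha.c \,\|\, e \rangle & \to & c[\alpha \leftarrow e] \\ (\tilde{\mu}) & \langle r \,\|\, \tilde{\mu}x.c \rangle & \to & c[x \leftarrow r] \end{array}$$

Example. Let $\omega$ be $\lambda x.\mu\alpha.\langle x \,\|\, x \bullet \alpha \rangle$ and $d$ be $\langle \omega \,\|\, \omega \bullet \beta \rangle$. The term $d$ corresponds
to the term $(\lambda x.xx)(\lambda x.xx)$ in $\lambda$-calculus. We have

$$d \;\xrightarrow{(\lambda)}\; \langle \mu\alpha.\langle \omega \,\|\, \omega \bullet \alpha \rangle \,\|\, \beta \rangle \;\xrightarrow{(\mu)}\; \langle \omega \,\|\, \omega \bullet \beta \rangle = d$$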

but also 




Both reductions are infinite. 

For all $\overline{\lambda}\mu\tilde{\mu}$-terms $r$, $e$, and $c$, we define two sets of free variables, namely $Fv_r(r)$,
$Fv_e(r)$, $Fv_r(e)$, $Fv_e(e)$, $Fv_r(c)$ and $Fv_e(c)$, in the following way:

* CalleR

$$\begin{array}{ll} Fv_r(x) = \{x\} & Fv_e(x) = \emptyset \\ Fv_r(\lambda x.r) = Fv_r(r) \setminus \{x\} & Fv_e(\lambda x.r) = Fv_e(r) \\ Fv_r(\mu\alpha.c) = Fv_r(c) & Fv_e(\mu\alpha.c) = Fv_e(c) \setminus \{\alpha\} \end{array}$$

* CalleE

$$\begin{array}{ll} Fv_r(\alpha) = \emptyset & Fv_e(\alpha) = \{\alpha\} \\ Fv_r(r \bullet e) = Fv_r(r) \cup Fv_r(e) & Fv_e(r \bullet e) = Fv_e(r) \cup Fv_e(e) \\ Fv_r(\tilde{\mu}x.c) = Fv_r(c) \setminus \{x\} & Fv_e(\tilde{\mu}x.c) = Fv_e(c) \end{array}$$

* Capsules

$$\begin{array}{l} Fv_r(\langle r \,\|\, e \rangle) = Fv_r(r) \cup Fv_r(e) \\ Fv_e(\langle r \,\|\, e \rangle) = Fv_e(r) \cup Fv_e(e) \end{array}$$

In this paper, we use the Barendregt convention on variables, also called hygiene. It says
that in a statement or an expression, there is no subexpression in which a variable is both
free and bound.
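The two families of free-variable sets can be computed by one structural recursion. The sketch below uses a hypothetical tuple encoding of terms that is not part of the paper: `("var", x)`, `("lam", x, r)` and `("mu", a, c)` for CalleR, `("covar", a)`, `("cons", r, e)` and `("mut", x, c)` for CalleE, and `("cap", r, e)` for capsules.

```python
def fv(term):
    """Return the pair (Fv_r, Fv_e) of free CalleR and CalleE variables of a term."""
    tag = term[0]
    if tag == "var":                       # x
        return {term[1]}, set()
    if tag == "covar":                     # alpha
        return set(), {term[1]}
    if tag in ("lam", "mut"):              # lambda x.r and mu-tilde x.c bind x
        fr, fe = fv(term[2])
        return fr - {term[1]}, fe
    if tag == "mu":                        # mu alpha.c binds alpha
        fr, fe = fv(term[2])
        return fr, fe - {term[1]}
    if tag in ("cons", "cap"):             # r . e and <r || e>
        fr1, fe1 = fv(term[1])
        fr2, fe2 = fv(term[2])
        return fr1 | fr2, fe1 | fe2
    raise ValueError(f"unknown constructor: {tag!r}")

# omega = lambda x. mu alpha. <x || x . alpha>,  d = <omega || omega . beta>
omega = ("lam", "x", ("mu", "a", ("cap", ("var", "x"), ("cons", ("var", "x"), ("covar", "a")))))
d = ("cap", omega, ("cons", omega, ("covar", "beta")))
```

In this encoding `omega` is closed, while `d` has the single free continuation variable `beta`.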

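From those reduction rules, one easily deduces that the normal forms are generated
by the following abstract syntax.

$$\begin{array}{rcl} r_{nf} & ::= & x \mid \lambda x.r_{nf} \mid \mu\alpha.c_{nf} \\ e_{nf} & ::= & \alpha \mid r_{nf} \bullet e_{nf} \mid \tilde{\mu}x.c_{nf} \\ c_{nf} & ::= & \langle x \,\|\, \alpha \rangle \mid \langle x \,\|\, r_{nf} \bullet e_{nf} \rangle \mid \langle \lambda x.r_{nf} \,\|\, \alpha \rangle \end{array}$$

In what follows we use the predicate nf to characterize the normal forms of type-free
$\overline{\lambda}\mu\tilde{\mu}$, in other words, one has $nf(M)$ if and only if $M$ is a normal form. We use the notations
$NF_c$, $NF_r$ and $NF_e$ for the three sets of normal forms in Capsules, CalleR and CalleE
and $SN_c$, $SN_r$ and $SN_e$ for the three sets of strongly normalizing terms in Capsules,
CalleR and CalleE.

Type-free $\overline{\lambda}\mu\tilde{\mu}$ is not confluent due to the critical pair

$$\langle \mu\alpha.\langle y \,\|\, \beta \rangle \,\|\, \tilde{\mu}x.\langle z \,\|\, \gamma \rangle \rangle$$

which reduces to two different normal forms $\langle y \,\|\, \beta \rangle$ and $\langle z \,\|\, \gamma \rangle$.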
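The three reduction rules and the critical pair can be replayed mechanically. The sketch below is an illustration only: it uses a hypothetical tuple encoding of terms (`"var"`, `"lam"`, `"mu"` for CalleR, `"covar"`, `"cons"`, `"mut"` for CalleE, `"cap"` for capsules), reduces at the root of a capsule only, and its substitution stops at a binder of the same name but does not rename bound variables, so it relies on the hygiene convention to avoid capture.

```python
def subst(term, name, new, kind):
    """Replace free occurrences of `name` by `new`; kind is "var" or "covar"."""
    tag = term[0]
    if tag in ("var", "covar"):
        return new if (tag == kind and term[1] == name) else term
    if tag in ("lam", "mut"):              # binders of CalleR variables
        if kind == "var" and term[1] == name:
            return term
        return (tag, term[1], subst(term[2], name, new, kind))
    if tag == "mu":                        # binder of CalleE variables
        if kind == "covar" and term[1] == name:
            return term
        return (tag, term[1], subst(term[2], name, new, kind))
    return (tag, subst(term[1], name, new, kind), subst(term[2], name, new, kind))

def root_steps(capsule):
    """All capsules obtained from <r || e> by one rule applied at the root."""
    _, r, e = capsule
    results = []
    if r[0] == "lam" and e[0] == "cons":   # (lambda)
        results.append(("cap", subst(r[2], r[1], e[1], "var"), e[2]))
    if r[0] == "mu":                       # (mu)
        results.append(subst(r[2], r[1], e, "covar"))
    if e[0] == "mut":                      # (mu-tilde)
        results.append(subst(e[2], e[1], r, "var"))
    return results

critical = ("cap", ("mu", "a", ("cap", ("var", "y"), ("covar", "b"))),
                   ("mut", "x", ("cap", ("var", "z"), ("covar", "g"))))
omega = ("lam", "x", ("mu", "a", ("cap", ("var", "x"), ("cons", ("var", "x"), ("covar", "a")))))
d = ("cap", omega, ("cons", omega, ("covar", "b")))
```

Here `root_steps(critical)` returns both reducts of the critical pair, and applying `root_steps` twice to `d` leads back to `d`, matching the looping example given for the reduction rules.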

2.2 Perpetual Strategies 

In this paper we deal with strong normalization. It is well known that reduction preserves
strong normalization, namely

$$M \to N \;\Rightarrow\; (SN(M) \Rightarrow SN(N)).$$

Actually, we would like to get preservation of strong normalization by expansion. For
that we define a specific strategy, the so-called perpetual strategy [2], which specifies
some reductions, in other words,

$$M \rightsquigarrow N \;\Rightarrow\; M \to N$$

and which preserves normalization by expansion, namely,

$$M \rightsquigarrow N \;\Rightarrow\; (SN(N) \Rightarrow SN(M)).$$

It seems that this is better defined by contraposition.

$$M \rightsquigarrow N \;\Rightarrow\; (\neg SN(M) \Rightarrow \neg SN(N)).$$

This means that if one reduces a non strongly normalizing term M to a term N by 
the perpetual strategy, then N is still non strongly normalizing. The perpetual strategy 
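$$\begin{array}{lll}
\mathsf{perpc}\ \langle \mu\alpha.c \,\|\, \tilde{\mu}x.c' \rangle = & \text{if } c[\alpha \leftarrow \tilde{\mu}x.c'] \notin SN_c \text{ then } c[\alpha \leftarrow \tilde{\mu}x.c'] & : \gamma : \textit{Capsules} \\
 & \text{else } c'[x \leftarrow \mu\alpha.c] & : \gamma : \textit{Capsules} \\
\mathsf{perpc}\ \langle \mu\alpha.c \,\|\, e \rangle = & (\text{assume } e \neq \tilde{\mu}x.c') & \\
 & \text{if } \alpha \in Fv_e(c) \text{ or } nf(e) \text{ then } c[\alpha \leftarrow e] & : \gamma : \textit{Capsules} \\
 & \text{else } \langle \mu\alpha.c \,\|\, \mathsf{perpe}\ e \rangle & : \gamma : \textit{Capsules} \\
\mathsf{perpc}\ \langle r \,\|\, \tilde{\mu}x.c \rangle = & (\text{assume } r \neq \mu\alpha.c) & \\
 & \text{if } x \in Fv_r(c) \text{ or } nf(r) \text{ then } c[x \leftarrow r] & : \gamma : \textit{Capsules} \\
 & \text{else } \langle \mathsf{perpr}\ r \,\|\, \tilde{\mu}x.c \rangle & : \gamma : \textit{Capsules} \\
\mathsf{perpc}\ \langle \lambda x.r \,\|\, r' \bullet e' \rangle = & \text{if } x \in Fv_r(r) \text{ or } nf(r') \text{ then } \langle r[x \leftarrow r'] \,\|\, e' \rangle & : \gamma : \textit{Capsules} \\
 & \text{else } \langle \lambda x.r \,\|\, (\mathsf{perpr}\ r') \bullet e' \rangle & : \gamma : \textit{Capsules} \\
\mathsf{perpc}\ \langle y \,\|\, e \rangle = & (\text{assume } e \neq \tilde{\mu}x.c') & \\
 & \text{if } nf(e) \text{ then } \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
 & \text{else } \langle y \,\|\, \mathsf{perpe}\ e \rangle & : \gamma : \textit{Capsules} \\
\mathsf{perpc}\ \langle r \,\|\, \beta \rangle = & (\text{assume } r \neq \mu\alpha.c) & \\
 & \text{if } nf(r) \text{ then } \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
 & \text{else } \langle \mathsf{perpr}\ r \,\|\, \beta \rangle & : \gamma : \textit{Capsules} \\
\mathsf{perpr}\ \lambda x.r = & \text{if } nf(r) \text{ then } \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
 & \text{else } \lambda x.(\mathsf{perpr}\ r) & : \rho : \textit{CalleR} \\
\mathsf{perpr}\ \mu\alpha.c = & \text{if } nf(c) \text{ then } \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
 & \text{else } \mu\alpha.(\mathsf{perpc}\ c) & : \rho : \textit{CalleR} \\
\mathsf{perpr}\ x = & \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
\mathsf{perpe}\ r \bullet e = & \text{if } nf(r) \text{ and } nf(e) \text{ then } \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
 & \text{if } nf(r) \text{ and } \neg nf(e) \text{ then } r \bullet (\mathsf{perpe}\ e) & : \varepsilon : \textit{CalleE} \\
 & \text{if } \neg nf(r) \text{ and } \neg nf(e) \text{ then } (\mathsf{perpr}\ r) \bullet e & : \varepsilon : \textit{CalleE} \\
\mathsf{perpe}\ \tilde{\mu}x.c = & \text{if } nf(c) \text{ then } \mathsf{unit} & : \nu\varphi : \textit{Unit} \\
 & \text{else } \tilde{\mu}x.(\mathsf{perpc}\ c) & : \varepsilon : \textit{CalleE} \\
\mathsf{perpe}\ \alpha = & \mathsf{unit} & : \nu\varphi : \textit{Unit}
\end{array}$$

Fig. 1. Definition of the functions perpc, perpr and perpe.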



we are going to define is deterministic, which means that when we write $M \rightsquigarrow N$, then $N$
is uniquely determined by $M$. Therefore, we can write $N$ as $(\mathsf{perp}\ M)$. Moreover the
perpetual strategy reaches normal forms of strongly normalizing terms, namely

M A jV A nf (AT) M — » N A nf(iV) 

where A is the transitive closure of . This means that SN can be generated by induction 
as follows 

$$SN = NF \cup \{M \in \textit{Term} \mid (\mathsf{perp}\ M) \in SN\}.$$

In this section, we define by mutual recursion three specific strategies for type-free
$\overline{\lambda}\mu\tilde{\mu}$ in the sets Capsules, CalleR and CalleE, called perpc, perpr and perpe. In order
to illustrate the kind of typing suggested by $\overline{\lambda}\mu\tilde{\mu}$ (see Section 3) we define the functions
perpc, perpr and perpe in that style. This means that each function takes a term and
returns two kinds of results. Each result is labeled by an identifier written in Greek letters
to keep the spirit of $\overline{\lambda}\mu\tilde{\mu}$. For instance, perpc returns either a value of type Unit labeled
by $\nu\varphi$ or a value of type Capsules labeled by $\gamma$. Actually $(\mathsf{perpc}\ c)$ returns the value
unit of type Unit when $c$ is a normal form (the label $\nu\varphi$ is reminiscent of nf) and returns a value
$c'$ of type Capsules if the perpetual strategy applied to $c$ yields $c'$. Hence if we write
the type à la $\overline{\lambda}\mu\tilde{\mu}$, we have

$$\begin{array}{l} \mathsf{perpc} :: (c : \textit{Capsules} \vdash \nu\varphi : \textit{Unit},\ \gamma : \textit{Capsules}) \\ \mathsf{perpr} :: (r : \textit{CalleR} \vdash \nu\varphi : \textit{Unit},\ \rho : \textit{CalleR}) \\ \mathsf{perpe} :: (e : \textit{CalleE} \vdash \nu\varphi : \textit{Unit},\ \varepsilon : \textit{CalleE}) \end{array}$$

Note that we introduce the type à la $\overline{\lambda}\mu\tilde{\mu}$ by $::$. We define these three functions
together in Figure 1.

Lemma 1 (Perpetuality).

- (perpc $c$) $\in SN_c \Rightarrow c \in SN_c$
- (perpr $r$) $\in SN_r \Rightarrow r \in SN_r$
- (perpe $e$) $\in SN_e \Rightarrow e \in SN_e$

Lemma 2. $SN_r$, $SN_e$ and $SN_c$ are the least sets such that

$SN_r = NF_r \cup \{r \in \mathrm{CalleR} \mid (\mathrm{perpr}\ r) \in SN_r\}$

$SN_e = NF_e \cup \{e \in \mathrm{CalleE} \mid (\mathrm{perpe}\ e) \in SN_e\}$

$SN_c = NF_c \cup \{c \in \mathrm{Capsules} \mid (\mathrm{perpc}\ c) \in SN_c\}$

3 The Type Assignment System 

Simple types corresponding to classical propositions are

$A, B ::= p \mid A \to B$

A basic r-type assignment is an expression of the form $x : A$, whereas a basic e-type
assignment is an expression of the form $\alpha : A$. A set $\{x_1 : A_1, \ldots, x_n : A_n\}$ with
distinct r-variables is an r-context. A set $\{\alpha_1 : A_1, \ldots, \alpha_m : A_m\}$ with distinct
e-variables is an e-context. r-contexts are denoted by $\Gamma$, whereas e-contexts are
denoted by $\Delta, \Delta_1, \ldots$.

Fig. 2. The type system $\lambda\mu\tilde{\mu}$.

The typing system $\lambda\mu\tilde{\mu}$ of Curien and Herbelin is based on three typing judgments
corresponding to syntactic categories:

$c : (\Gamma \vdash \Delta)$
$\Gamma \vdash r : A \mid \Delta$
$\Gamma \mid e : A \vdash \Delta$.

The first one is the type of a capsule, the second is the type of a caller and the third is
the type of a callee. The rules of the type system $\lambda\mu\tilde{\mu}$ are given in Figure 2.

Example. By the Curry-Howard correspondence simply typed lambda calculus
corresponds to intuitionistic logic. It is well-known that classical logic is obtained from
intuitionistic logic by adding Peirce’s law $((A \to B) \to A) \to A$. According to this, Peirce’s
law is not inhabited in simply typed lambda calculus, i.e., there is no lambda term of
that type. The derivation in Figure 3 shows that there is a $\lambda\mu\tilde{\mu}$-term whose type is
Peirce’s law. Here we show that Peirce’s law is inhabited in $\lambda\mu\tilde{\mu}$, so that the $\lambda\mu\tilde{\mu}$-term
$\lambda x.\mu\beta.\langle x \parallel (\lambda y.\mu\gamma.\langle y \parallel \beta \rangle) \bullet \beta \rangle$ is typeable by Peirce’s law. Let $\Delta$ be the typing tree for
$x : T \mid (\lambda y.\mu\gamma.\langle y \parallel \beta \rangle) \bullet \beta : T \vdash \beta : A$, where $T$ denotes the formula $(A \to B) \to A$.
$\Delta$ and the tree for typing Peirce’s law are given in Figure 3.

Still in $\lambda\mu\tilde{\mu}$ one cannot type all normal forms, e.g., the $\lambda\mu\tilde{\mu}$-term $\lambda x.\mu\alpha.\langle x \parallel x \bullet \alpha \rangle$
(seen in Section 2) which is a normal form and corresponds to the lambda term $\lambda x.xx$ is
not typeable in $\lambda\mu\tilde{\mu}$. For this reason we introduce intersection types and corresponding
new type assignment rules.




Fig. 3. The proof tree for Peirce's law 
Intersection types are generated in the following way:

$A, B ::= p \mid A \to B \mid A \cap B$

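Intersection of contexts is defined in the following way:

$\Gamma_1 \cap \Gamma_2 = \{x : A \mid x : A \in \Gamma_1 \wedge x \notin \Gamma_2\} \cup \{x : A \mid x : A \in \Gamma_2 \wedge x \notin \Gamma_1\} \cup \{x : A \cap B \mid x : A \in \Gamma_1 \wedge x : B \in \Gamma_2\}$

$\Delta_1 \cap \Delta_2 = \{\alpha : A \mid \alpha : A \in \Delta_1 \wedge \alpha \notin \Delta_2\} \cup \{\alpha : A \mid \alpha : A \in \Delta_2 \wedge \alpha \notin \Delta_1\} \cup \{\alpha : A \cap B \mid \alpha : A \in \Delta_1 \wedge \alpha : B \in \Delta_2\}$.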
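The definition of context intersection can be sketched in Python as follows. This is only an illustration under assumptions: the class names `Atom`, `Arrow`, `Inter` and the function `intersect_contexts` are hypothetical, and a context is modelled as a dictionary from variable names to types.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Atom:
    """A type variable p (hypothetical name)."""
    name: str


@dataclass(frozen=True)
class Arrow:
    """An arrow type A -> B."""
    dom: "Type"
    cod: "Type"


@dataclass(frozen=True)
class Inter:
    """An intersection type A ∩ B."""
    left: "Type"
    right: "Type"


Type = Union[Atom, Arrow, Inter]
Context = Dict[str, Type]  # distinct variables mapped to their declared types


def intersect_contexts(ctx1: Context, ctx2: Context) -> Context:
    """Keep declarations occurring in one context only; intersect shared ones."""
    result: Context = {}
    for var, ty in ctx1.items():
        # x : A in ctx1 and x : B in ctx2 gives x : A ∩ B
        result[var] = Inter(ty, ctx2[var]) if var in ctx2 else ty
    for var, ty in ctx2.items():
        if var not in ctx1:
            result[var] = ty
    return result


A, B, C = Atom("A"), Atom("B"), Atom("C")
gamma1 = {"x": A, "y": B}
gamma2 = {"x": Arrow(A, B), "z": C}
merged = intersect_contexts(gamma1, gamma2)
print(merged["x"])  # the shared variable x receives A ∩ (A -> B)
```

The same function serves for r-contexts and e-contexts, since both are sets of declarations with distinct variables.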

The type system $\lambda\mu\tilde{\mu}^{\cap}$ is obtained by extending $\lambda\mu\tilde{\mu}$ with type assignment rules
regarding intersection given in Figure 4.

It is worth noticing that if $c$ is a capsule then the last rule of a tree that types $c$ is
always (cut). It was shown by Hindley in [13] that intersection in lambda calculus does
not behave as intuitionistic conjunction. In a similar way intersection in $\lambda\mu\tilde{\mu}$ does not
behave as classical conjunction.

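Example. The above mentioned $\lambda\mu\tilde{\mu}$-term $\lambda x.\mu\alpha.\langle x \parallel x \bullet \alpha \rangle$, which is not typeable
in $\lambda\mu\tilde{\mu}$, is typeable in $\lambda\mu\tilde{\mu}^{\cap}$ with the same type $A \cap (A \to B)$ for $x$ as in lambda calculus with
intersection types, i.e., $\lambda x.\mu\alpha.\langle x \parallel x \bullet \alpha \rangle : (A \cap (A \to B)) \to B$.

Still there are $\lambda\mu\tilde{\mu}$-terms that are not typeable in $\lambda\mu\tilde{\mu}^{\cap}$. The term $\langle \omega \parallel \omega \bullet \alpha \rangle$, where
$\omega = \lambda x.\mu\beta.\langle x \parallel x \bullet \beta \rangle$, simulates the term $(\lambda x.xx)(\lambda x.xx)$ of the lambda calculus, as
noticed above. It cannot be typed in $\lambda\mu\tilde{\mu}^{\cap}$. Roughly speaking, we see that $\omega$ and $\omega \bullet \alpha$
should match their types, which leads to matching a type, say $C$, with a type $C \to D$.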

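Lemma 3 (Context expansion lemma). Let $\Gamma \subseteq \Gamma'$ and $\Delta \subseteq \Delta'$.

(i) If $\Gamma \vdash r : A \mid \Delta$, then $\Gamma' \vdash r : A \mid \Delta'$.

(ii) If $\Gamma \mid e : A \vdash \Delta$, then $\Gamma' \mid e : A \vdash \Delta'$.

(iii) If $c : (\Gamma \vdash \Delta)$, then $c : (\Gamma' \vdash \Delta')$.

Lemma 4 (Context restriction lemma).

(i) If $\Gamma \vdash r : A \mid \Delta$, then $\Gamma \upharpoonright Fv_r(r) \vdash r : A \mid \Delta \upharpoonright Fv_e(r)$.

(ii) If $\Gamma \mid e : A \vdash \Delta$, then $\Gamma \upharpoonright Fv_r(e) \mid e : A \vdash \Delta \upharpoonright Fv_e(e)$.

(iii) If $c : (\Gamma \vdash \Delta)$, then $c : (\Gamma \upharpoonright Fv_r(c) \vdash \Delta \upharpoonright Fv_e(c))$.

Lemma 5 (Context intersection lemma).

(i) If $\Gamma \vdash r : A \mid \Delta$, then $\Gamma \cap \Gamma' \vdash r : A \mid \Delta$.

(ii) If $\Gamma \mid e : A \vdash \Delta$, then $\Gamma \cap \Gamma' \mid e : A \vdash \Delta$.

(iii) If $c : (\Gamma \vdash \Delta)$, then $c : (\Gamma \cap \Gamma' \vdash \Delta)$.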

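Lemma 6 (Typeability lemma).

(i) If $\lambda x.r$ is typeable, then $\Gamma \vdash \lambda x.r : \bigcap_{i \in I}(A_i \to B_i) \mid \Delta$, for some $\Gamma$, $\Delta$, $A_i$, $B_i$.

(ii) If $r \bullet e$ is typeable, then $\Gamma \mid r \bullet e : \bigcap_{i \in I}(A_i \to B_i) \vdash \Delta$, for some $\Gamma$, $\Delta$, $A_i$, $B_i$.

(iii) If $\mu\alpha.c$ is typeable, then $\Gamma \vdash \mu\alpha.c : \bigcap_{i \in I} A_i \mid \Delta$, for some $\Gamma$, $\Delta$, $A_i$. If in addition
$\alpha \notin Fv_e(c)$ then $\Gamma \vdash \mu\alpha.c : A \mid \Delta$, for some $\Gamma$, $\Delta$ and any $A$.

(iv) If $\tilde{\mu}x.c$ is typeable, then $\Gamma \mid \tilde{\mu}x.c : \bigcap_{i \in I} A_i \vdash \Delta$, for some $\Gamma$, $\Delta$, $A_i$. If in addition
$x \notin Fv_r(c)$ then $\Gamma \mid \tilde{\mu}x.c : A \vdash \Delta$, for some $\Gamma$, $\Delta$ and any $A$.

Lemma 7 (Elimination lemma).

(i) If $\Gamma \vdash \lambda x.r : \bigcap_{i \in I}(A_i \to B_i) \mid \Delta$, then $\Gamma, x : A_i \vdash r : B_i \mid \Delta$.

(ii) If $\Gamma \mid r \bullet e : \bigcap_{i \in I}(A_i \to B_i) \vdash \Delta$, then $\Gamma \vdash r : A_i \mid \Delta_1$ and $\Gamma \mid e : B_i \vdash \Delta_2$, for
some $i \in \{1, 2\}$ and $\Delta_1 \cap \Delta_2 = \Delta$.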

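Lemma 8. (i) If $r'[x \leftarrow r]$ is typeable, say $\Gamma \vdash r'[x \leftarrow r] : A \mid \Delta$, and $x \in Fv_r(r')$,
then there exists $B$ such that $\Gamma \vdash r : B \mid \Delta$ and $\Gamma, x : B \vdash r' : A \mid \Delta$.

(ii) If $e[x \leftarrow r]$ is typeable, say $\Gamma \mid e[x \leftarrow r] : A \vdash \Delta$, and $x \in Fv_r(e)$, then there
exists $B$ such that $\Gamma \vdash r : B \mid \Delta$ and $\Gamma, x : B \mid e : A \vdash \Delta$.

(iii) If $c[x \leftarrow r] : (\Gamma \vdash \Delta)$ and $x \in Fv_r(c)$, then there exists $B$ such that $\Gamma \vdash r : B \mid \Delta$
and $c : (\Gamma, x : B \vdash \Delta)$.

Lemma 9. (i) If $r[\alpha \leftarrow e]$ is typeable, say $\Gamma \vdash r[\alpha \leftarrow e] : A \mid \Delta$, and $\alpha \in Fv_e(r)$,
then there exists $B$ such that $\Gamma \mid e : B \vdash \Delta$ and $\Gamma \vdash r : A \mid \alpha : B, \Delta$.

(ii) If $e'[\alpha \leftarrow e]$ is typeable, say $\Gamma \mid e'[\alpha \leftarrow e] : A \vdash \Delta$, and $\alpha \in Fv_e(e')$, then there
exists $B$ such that $\Gamma \mid e : B \vdash \Delta$ and $\Gamma \mid e' : A \vdash \alpha : B, \Delta$.

(iii) If $c[\alpha \leftarrow e] : (\Gamma \vdash \Delta)$ and $\alpha \in Fv_e(c)$, then there exists $B$ such that $\Gamma \mid e : B \vdash \Delta$
and $c : (\Gamma \vdash \alpha : B, \Delta)$.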
4 Typeability of Strongly Normalizing Terms

Proposition 10 (Typeability of normal forms). Normal forms are typeable in $\lambda\mu\tilde{\mu}^{\cap}$.

Proof. By induction on the structure of the normal forms $c$, $r$ and $e$.

Callers: if the normal form is $x$, $\lambda x.r_{nf}$ or $\mu\alpha.c_{nf}$, then all three cases are
straightforward. Let us consider $\mu\alpha.c_{nf}$. By induction, one may suppose that
$c_{nf} : (\Gamma \vdash \alpha : A, \Delta)$, then by ($\mu$), $\Gamma \vdash \mu\alpha.c_{nf} : A \mid \Delta$.

Callees: the proof is again straightforward for normal forms $\alpha$ or $\tilde{\mu}x.c_{nf}$. Let us
consider the normal form $r_{nf} \bullet e_{nf}$, where $r_{nf}$ and $e_{nf}$ are normal forms. By the
induction hypothesis there are contexts $\Gamma$, $\Delta$, $\Gamma_1$ and $\Delta_1$ and types $A$ and $B$ such that
$\Gamma \vdash r_{nf} : A \mid \Delta$ and $\Gamma_1 \mid e_{nf} : B \vdash \Delta_1$. By Lemma 5 we get $\Gamma \cap \Gamma_1 \vdash r_{nf} : A \mid \Delta$
and $\Gamma \cap \Gamma_1 \mid e_{nf} : B \vdash \Delta_1$. Now the application of ($\to L$) leads to
$\Gamma \cap \Gamma_1 \mid r_{nf} \bullet e_{nf} : A \to B \vdash \Delta \cap \Delta_1$.

Capsules: the normal form is $\langle x \parallel \alpha \rangle$, $\langle x \parallel r_{nf} \bullet e_{nf} \rangle$ or $\langle \lambda x.r_{nf} \parallel \alpha \rangle$, where
$r_{nf}$ and $e_{nf}$ are normal forms. All three cases follow the same pattern as above. Let
us consider $\langle x \parallel r_{nf} \bullet e_{nf} \rangle$. By the induction hypothesis and by Lemma 7 there are
contexts $\Gamma$ and $\Delta$ such that $\Gamma \mid r_{nf} \bullet e_{nf} : A \to B \vdash \Delta$. One can distinguish two cases:

1. If the variable $x$ is a free variable of the callee $r_{nf} \bullet e_{nf}$, which means that it is
declared in $\Gamma$, say $\Gamma = \Gamma_1, x : C$, then $\Gamma_1, x : (A \to B) \cap C \mid r_{nf} \bullet e_{nf} : A \to B \vdash \Delta$
is obtained by ($\cap L_r$). On the other hand $\Gamma_1, x : (A \to B) \cap C \vdash x : A \to B \mid \Delta$
is obtained from $\Gamma_1, x : A \to B \vdash x : A \to B \mid \Delta$, again by ($\cap L_r$). Hence, we get
$\langle x \parallel r_{nf} \bullet e_{nf} \rangle : (\Gamma_1, x : (A \to B) \cap C \vdash \Delta)$.

2. If the variable $x$ is not a free variable of the callee $r_{nf} \bullet e_{nf}$, then without loss of
generality, by the Context restriction lemma (Lemma 4) we can suppose that $x$ is not declared in $\Gamma$,
which means that $\langle x \parallel r_{nf} \bullet e_{nf} \rangle : (\Gamma, x : A \to B \vdash \Delta)$. □



Proposition 11 (Perpetual subject expansion).

(i) If perpr $r$ is typeable, then $r$ is typeable.

(ii) If perpe $e$ is typeable, then $e$ is typeable.

(iii) If $c \in SN_c$ and for all $c'$ such that $h(c') < h(c)$ (where $h(c)$ is the length of the
longest reduction at $c$) $c'$ is typeable, then $c$ is typeable.



Proof. We prove the parts simultaneously by induction on the generation of perpr $r$,
perpe $e$ and $h(c)$.

Let us start with case (iii). In most of the cases the typability of perpc $c$ is enough
to conclude on the typability of $c$. The only exception is when $c = \langle \mu\alpha.c'' \parallel \tilde{\mu}x.c' \rangle$ and
perpc $c = c''[\alpha \leftarrow \tilde{\mu}x.c'] \notin SN_c$, which is avoided by the assumption $c \in SN_c$.

$c = \langle \mu\alpha.c' \parallel e \rangle$ and perpc $c = c'[\alpha \leftarrow e]$. If $\alpha \in Fv_e(c')$ then by Lemma 9, there
exists $A$ such that $\Gamma \mid e : A \vdash \Delta$ and $c' : (\Gamma \vdash \alpha : A, \Delta)$. By ($\mu$) $\Gamma \vdash \mu\alpha.c' : A \mid \Delta$,
therefore $\langle \mu\alpha.c' \parallel e \rangle : (\Gamma \vdash \Delta)$. If $\alpha \notin Fv_e(c')$, then nf($e$), hence $e$ is typeable by
Proposition 10 (say $\Gamma_1 \mid e : A \vdash \Delta_1$), and perpc $c = c'$. Therefore, by Lemma 6
$\mu\alpha.c'$ is typeable by $\Gamma_2 \vdash \mu\alpha.c' : A \mid \Delta_2$. By the Context intersection lemma (Lemma 5), we get
$\Gamma_1 \cap \Gamma_2 \vdash \mu\alpha.c' : A \mid \Delta_2$ and $\Gamma_1 \cap \Gamma_2 \mid e : A \vdash \Delta_1$, hence the result follows by (cut):
$c = \langle \mu\alpha.c' \parallel e \rangle : (\Gamma_1 \cap \Gamma_2 \vdash \Delta_1 \cap \Delta_2)$. Notice that this case cannot be deduced by the
first rule of Figure 1.



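$c = \langle \mu\alpha.c' \parallel e \rangle$ and perpc $c = \langle \mu\alpha.c' \parallel \mathrm{perpe}\ e \rangle$. Then $\alpha \notin Fv_e(c')$. Since by
induction perpc $c$ is typeable, we can assert that $\mu\alpha.c'$ and perpe $e$ are typeable. By
induction $e$ is typeable, say by $\Gamma_2 \mid e : A \vdash \Delta_2$, and because $\alpha \notin Fv_e(c')$, $\Gamma_1 \vdash \mu\alpha.c' : A \mid \Delta_1$.
We have then $\Gamma_1 \cap \Gamma_2 \vdash \mu\alpha.c' : A \mid \Delta_1$ and $\Gamma_1 \cap \Gamma_2 \mid e : A \vdash \Delta_2$. Therefore
$c = \langle \mu\alpha.c' \parallel e \rangle : (\Gamma_1 \cap \Gamma_2 \vdash \Delta_1 \cap \Delta_2)$.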



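$c = \langle r \parallel \tilde{\mu}x.c' \rangle$ and perpc $c = c'[x \leftarrow r]$.

- Case $x \in Fv_r(c')$ works as the previous similar case with perpc $c = c'[\alpha \leftarrow e]$ and
$\alpha \in Fv_e(c')$.

- Case $x \notin Fv_r(c')$ and nf($r$) works as the similar case with perpc $c = c'[\alpha \leftarrow e]$
and $\alpha \notin Fv_e(c')$ and nf($e$).

- Case $c = \langle \mu\alpha.c'' \parallel \tilde{\mu}x.c' \rangle$, then $c''[\alpha \leftarrow \tilde{\mu}x.c'] \in SN_c$. By induction it is typable,
hence $c''$ is typable by $c'' : (\Gamma_1 \vdash \alpha : A, \Delta_1)$. By ($\cap R_e$) we get
$c'' : (\Gamma_1 \vdash \alpha : A \cap B, \Delta_1)$, so $\mu\alpha.c''$ is typable by $\Gamma_1 \vdash \mu\alpha.c'' : A \cap B \mid \Delta_1$. On the other hand,
$c'[x \leftarrow \mu\alpha.c'']$ is typable by $c' : (\Gamma_2, x : B \vdash \Delta_2)$, hence $c' : (\Gamma_2, x : A \cap B \vdash \Delta_2)$.
Therefore, $\tilde{\mu}x.c'$ is typable by $\Gamma_2 \mid \tilde{\mu}x.c' : A \cap B \vdash \Delta_2$. Hence
$c = \langle r \parallel \tilde{\mu}x.c' \rangle : (\Gamma_1 \cap \Gamma_2, x : B \vdash \Delta_1 \cap \Delta_2)$.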




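$c = \langle \lambda x.r \parallel r' \bullet e' \rangle$ and perpc $c = \langle r[x \leftarrow r'] \parallel e' \rangle : (\Gamma \vdash \Delta)$. Therefore the last
rule that types perpc $c$ is (cut). This means that there exists
an $A$ such that $\Gamma \vdash r[x \leftarrow r'] : A \mid \Delta_1$ and $\Gamma \mid e' : A \vdash \Delta_2$.

* If $x \in Fv_r(r)$, by Lemma 8 there exists $B$ such that $\Gamma, x : B \vdash r : A \mid \Delta_1$
and $\Gamma \vdash r' : B \mid \Delta_1$. Therefore $\Gamma \vdash \lambda x.r : B \to A \mid \Delta_1$. By ($\to L$), one gets
$\Gamma \mid r' \bullet e' : B \to A \vdash \Delta_1 \cap \Delta_2$. By (cut), $\langle \lambda x.r \parallel r' \bullet e' \rangle : (\Gamma \vdash \Delta_1 \cap \Delta_2)$.

* If $x \notin Fv_r(r)$ and nf($r'$), then $r[x \leftarrow r'] = r$. By assumption $\Gamma \vdash r : A \mid \Delta_1$. From
nf($r'$) it follows by Proposition 10 that $r'$ is typeable, i.e., $\Gamma' \vdash r' : B \mid \Delta'$. By the Context
intersection lemma and the Context expansion lemma, one gets $\Gamma \cap \Gamma', x : B \vdash r : A \mid \Delta_1$,
$\Gamma \cap \Gamma' \vdash r' : B \mid \Delta'$ and $\Gamma \cap \Gamma' \mid e' : A \vdash \Delta_2$. Collecting all these judgments
through appropriate rules, one gets $\langle \lambda x.r \parallel r' \bullet e' \rangle : (\Gamma \cap \Gamma' \vdash \Delta_1 \cap \Delta_2 \cap \Delta')$.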



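$c = \langle \lambda x.r \parallel r' \bullet e' \rangle$ and $x \notin Fv_r(r)$ and $\neg$nf($r'$). By assumption
perpc $c = \langle \lambda x.r \parallel \mathrm{perpr}(r') \bullet e' \rangle : (\Gamma \vdash \Delta)$. By the Typeability lemma there exist $A$ and $B$ such that
$\Gamma \vdash \lambda x.r : A \to B \mid \Delta_1$ and $\Gamma \mid \mathrm{perpr}(r') \bullet e' : A \to B \vdash \Delta_2$. Moreover, by Lemma 7
$\Gamma, x : A \vdash r : B \mid \Delta_1$, $\Gamma \vdash \mathrm{perpr}(r') : A \mid \Delta'_2$ and $\Gamma \mid e' : B \vdash \Delta''_2$. By induction, there
exist $\Gamma'$, $\Delta'$ and $C$ such that $\Gamma' \vdash r' : C \mid \Delta'$. According to the Context intersection lemma
and ($\to L$), $\Gamma \cap \Gamma' \mid r' \bullet e' : C \to B \vdash \Delta' \cap \Delta''_2$. Since $x \notin Fv_r(r)$, then $\Gamma \vdash r : B \mid \Delta_1$
by the Context restriction lemma. Therefore, by the Context expansion and Context intersection
lemmas, $\Gamma \cap \Gamma', x : C \vdash r : B \mid \Delta_1$, hence $\Gamma \cap \Gamma' \vdash \lambda x.r : C \to B \mid \Delta_1$. Therefore by
(cut) $\langle \lambda x.r \parallel r' \bullet e' \rangle : (\Gamma \cap \Gamma' \vdash \Delta_1 \cap \Delta''_2 \cap \Delta')$.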
$c = \langle x \parallel e \rangle$: by assumption $\langle x \parallel \mathrm{perpe}\ e \rangle$ is typeable and by induction $e$ is typeable, say
$\Gamma \mid e : A \vdash \Delta$. On the other hand $x : A \vdash x : A \mid$. Therefore, $\Gamma \cap \{x : A\} \mid e : A \vdash \Delta$
and $\Gamma \cap \{x : A\} \vdash x : A \mid \Delta$, hence $\langle x \parallel e \rangle : (\Gamma \cap \{x : A\} \vdash \Delta)$.

Cases perpr $r$ and perpe $e$ are easy and follow the same scheme as perpc $c$. □



Proposition 12. Strongly normalizing $\lambda\mu\tilde{\mu}$-terms are typeable in $\lambda\mu\tilde{\mu}^{\cap}$.

Proof. Lemma 2 characterizes strongly normalizing terms as finitely reachable from
normal forms by “perpetual expansion”. Therefore the result comes from Proposition 10 on
typeability of normal forms and Proposition 11 on perpetual subject expansion. □



5 Strong Normalization of Typeable Terms 

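In this section, we consider terms that are free of any occurrence of $\tilde{\mu}$ ($\tilde{\mu}$-free terms).
This way we avoid the difficult problem of having the critical pair between the rules ($\mu$)
and ($\tilde{\mu}$) and consider a confluent subsystem of $\lambda\mu\tilde{\mu}$.

Let $\mathcal{S} \subseteq$ Capsules, $\mathcal{R} \subseteq$ CalleR and $\mathcal{E} \subseteq$ CalleE. Then we say that $\mathcal{S}$ is
$(\mathcal{R}, \mathcal{E})$-saturated if the following holds:

1. $(\forall e \in \mathcal{E})\ \langle x \parallel e \rangle \in \mathcal{S}$;

2. $(\forall r, r' \in \mathcal{R})\ \langle r[x \leftarrow r'] \parallel e \rangle \in \mathcal{S} \Rightarrow \langle \lambda x.r \parallel r' \bullet e \rangle \in \mathcal{S}$;

3. $(\forall e \in \mathcal{E})\ c[\alpha \leftarrow e] \in \mathcal{S} \Rightarrow \langle \mu\alpha.c \parallel e \rangle \in \mathcal{S}$.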

Lemma 13. $SN_c$ is $(SN_r, SN_e)$-saturated.

Proof. 1. Straightforward, since $\langle x \parallel e \rangle \in SN_c$ whenever $e \in SN_e$.

2. If $\langle r[x \leftarrow r'] \parallel e \rangle \in SN_c$, then in order to conclude $\langle \lambda x.r \parallel r' \bullet e \rangle \in SN_c$ the
only problem could arise when $x \notin Fv_r(r)$, but this is avoided by the assumption
$r' \in SN_r$.

3. Similarly. The only problem that could appear here is if $e = \tilde{\mu}x.c'$. In this case we
could not prove that $\langle \mu\alpha.c \parallel \tilde{\mu}x.c' \rangle \in \mathcal{S}$, since there could be an infinite reduction
starting with $c'[x \leftarrow \mu\alpha.c]$. Nevertheless this cannot happen, since we consider
$\tilde{\mu}$-free terms. □

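We define two type interpretations:

- e-interpretation $[\![A]\!]_e$ (Callees):

1. $[\![p]\!]_e = SN_e$;

2. $[\![B \cap C]\!]_e = [\![B]\!]_e \cap [\![C]\!]_e$;

3. $[\![B \to C]\!]_e = Var_e \cup \{r \bullet e \mid r \in [\![B]\!]_r \text{ and } e \in [\![C]\!]_e\}$.

- r-interpretation $[\![A]\!]_r$ (Callers):

1. $[\![p]\!]_r = SN_r$;

2. $r \in [\![A]\!]_r$ if and only if $\langle r \parallel e \rangle \in SN_c$ for all $e \in [\![A]\!]_e$.

Lemma 14. $[\![A]\!]_r \subseteq SN_r$, $[\![A]\!]_e \subseteq SN_e$ and $Var_r \subseteq [\![A]\!]_r$.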
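Proposition 15 (Adequacy). Let $r$, $e$ and $c$ be $\tilde{\mu}$-free terms.

Callers:
$x_i : A_i, \Gamma \vdash r : A \mid \alpha_j : B_j, \Delta \;\Rightarrow\; (\forall r_i \in [\![A_i]\!]_r)(\forall e_j \in [\![B_j]\!]_e)\ r[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![A]\!]_r$.

Callees:
$x_i : A_i, \Gamma \mid e : A \vdash \alpha_j : B_j, \Delta \;\Rightarrow\; (\forall r_i \in [\![A_i]\!]_r)(\forall e_j \in [\![B_j]\!]_e)\ e[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![A]\!]_e$.

Capsules:
$c : (x_i : A_i, \Gamma \vdash \alpha_j : B_j, \Delta) \;\Rightarrow\; (\forall r_i \in [\![A_i]\!]_r)(\forall e_j \in [\![B_j]\!]_e)\ c[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in SN_c$.
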

Proof. By induction on the derivation in $\lambda\mu\tilde{\mu}^{\cap}$.

Callers:

Case $r$ is a variable. Then $r = x_i$ or $r = y$, $y : C \in \Gamma$. The case $r = x_i$ is
straightforward since $r[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] = r_i \in [\![A_i]\!]_r$. If $r = y$ and $y : C \in \Gamma$, then
$r[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] = y$ and $y \in [\![C]\!]_r$ by Lemma 14.

Case $r = \lambda x.s$, the last applied rule is ($\to R$). Then $x_i : A_i, \Gamma \vdash \lambda x.s : C \to D \mid \alpha_j : B_j, \Delta$
is obtained from $x_i : A_i, x : C, \Gamma \vdash s : D \mid \alpha_j : B_j, \Delta$. First of all, we
conclude by induction that $s[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![D]\!]_r$, hence $s[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in SN_r$
and also $\lambda x.s[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in SN_r$. We have two subcases to consider.

1. For every $\alpha$, obviously $\langle \lambda x.s[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \parallel \alpha \rangle \in SN_c$.

2. By the induction hypothesis, for every $r_i \in [\![A_i]\!]_r$, $r' \in [\![C]\!]_r$ and $e_j \in [\![B_j]\!]_e$,
one has $s[x_i \leftarrow r_i][x \leftarrow r'][\alpha_j \leftarrow e_j] \in [\![D]\!]_r$. By the definition of $[\![D]\!]_r$ one gets

$\langle s[x_i \leftarrow r_i][x \leftarrow r'][\alpha_j \leftarrow e_j] \parallel e \rangle \in SN_c$

for all $e \in [\![D]\!]_e$. The set $SN_c$ is $(SN_r, SN_e)$-saturated as shown in Lemma 13, hence

$\langle \lambda x.s[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \parallel r' \bullet e \rangle \in SN_c$

by case 2 of the definition of saturated sets. It is easy to see that $r' \bullet e \in [\![C \to D]\!]_e$
since $r' \in [\![C]\!]_r$ and $e \in [\![D]\!]_e$. Also we can notice that in this way we obtained all
callees in $[\![C \to D]\!]_e$ which are of the shape $r \bullet e$.

The previous two subcases prove that for all $e \in [\![C \to D]\!]_e$, one has
$\langle \lambda x.s[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \parallel e \rangle \in SN_c$ and we conclude that

$(\lambda x.s)[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![C \to D]\!]_r$.

Case $r = \mu\alpha.c$, the last rule applied is ($\mu$). From $c : (x_i : A_i, \Gamma \vdash \alpha : A, \alpha_j : B_j, \Delta)$,
by the induction hypothesis

$c[x_i \leftarrow r_i][\alpha \leftarrow e][\alpha_j \leftarrow e_j] \in SN_c$

for all $r_i \in [\![A_i]\!]_r$, $e \in [\![A]\!]_e$ and $e_j \in [\![B_j]\!]_e$.

Since $SN_c$ is $(SN_r, SN_e)$-saturated (case 3 of the definition) one has

$\langle \mu\alpha.c[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \parallel e \rangle \in SN_c$.

Let us notice here that $\mu\alpha.c[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in SN_r$. By definition of $[\![A]\!]_r$ we obtain

$\mu\alpha.c[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![A]\!]_r$,

thus $(\mu\alpha.c)[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![A]\!]_r$.

The cases ($\cap L_r$), ($\cap R_r$) and ($\cap R_e$) are easy to prove.

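Callees:

Case $e$ is a variable. Then $e = \alpha_j$ or $e = \alpha$, $\alpha : C \in \Delta$. The case $e = \alpha_j$ is
straightforward since $e[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] = e_j \in [\![B_j]\!]_e$. If $e = \alpha$ and $\alpha : C \in \Delta$,
then $e[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] = \alpha$. Moreover $\alpha \in [\![C]\!]_e$ holds by definition of the type
e-interpretation.

Case $e = r' \bullet e'$, the last rule applied is ($\to L$). Then $x_i : A_i, \Gamma \mid r' \bullet e' : (C \to D) \vdash \alpha_j : B_j, \Delta$
is obtained from $x_i : A_i, \Gamma \vdash r' : C \mid \alpha_j : B_j, \Delta$ and
$x_i : A_i, \Gamma \mid e' : D \vdash \alpha_j : B_j, \Delta$. By the induction hypothesis $r'[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![C]\!]_r$
and $e'[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![D]\!]_e$ for all $r_i \in [\![A_i]\!]_r$ and $e_j \in [\![B_j]\!]_e$. According
to the definition of $[\![C \to D]\!]_e$ we get $r'[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \bullet e'[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![C \to D]\!]_e$,
which leads to $(r' \bullet e')[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![C \to D]\!]_e$.

The cases ($\cap L_r$), ($\cap L_e$) and ($\cap R_e$) are easy to prove.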

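Capsules:

Let $c = \langle r \parallel e \rangle : (x_i : A_i, \Gamma \vdash \alpha_j : B_j, \Delta)$. The last rule applied in typing a capsule
is (cut), which means that there exist $D$, $r$ and $e$ such that $x_i : A_i, \Gamma \vdash r : D \mid \alpha_j : B_j, \Delta$
and $x_i : A_i, \Gamma \mid e : D \vdash \alpha_j : B_j, \Delta$ and $c = \langle r \parallel e \rangle$. By the induction hypothesis
$r[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![D]\!]_r$ and $e[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in [\![D]\!]_e$ for all $r_i \in [\![A_i]\!]_r$ and
$e_j \in [\![B_j]\!]_e$. According to the definition of $[\![D]\!]_r$ we obtain
$\langle r[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \parallel e[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \rangle \in SN_c$. Therefore $c[x_i \leftarrow r_i][\alpha_j \leftarrow e_j] \in SN_c$. □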

Proposition 16. $\tilde{\mu}$-free terms typeable in $\lambda\mu\tilde{\mu}^{\cap}$ are strongly normalizing.

Proof. By taking $[x_i \leftarrow x_i]$ and $[\alpha_j \leftarrow \alpha_j]$ one gets: $\Gamma \vdash r : A \mid \Delta \Rightarrow r \in [\![A]\!]_r$,
$\Gamma \mid e : A \vdash \Delta \Rightarrow e \in [\![A]\!]_e$ and $c : (\Gamma \vdash \Delta) \Rightarrow c \in SN_c$. Combined with Lemma 14
this gives the result. □

According to Proposition 12 and Proposition 16 we get the equivalence of typeable
and strongly normalizing $\tilde{\mu}$-free terms.

Corollary 17. $\tilde{\mu}$-free terms are typeable in $\lambda\mu\tilde{\mu}^{\cap}$ if and only if they are strongly
normalizing.



6 Conclusion 

Our method based on a general concept of saturated sets and type interpretation should
be easily extended to other kinds of normalization. It can also be used to study filter
types for this calculus.

The intersection type system $\lambda\mu\tilde{\mu}^{\cap}$ introduced in this paper completely characterizes
all strongly normalizing $\tilde{\mu}$-free terms. Nevertheless, the method presented in Section 5
fails to prove the strong normalization of all $\lambda\mu\tilde{\mu}$-terms typeable in $\lambda\mu\tilde{\mu}^{\cap}$. Therefore,
an open problem that lurks behind this paper is to find a proof-technique for proving the
strong normalization of all terms typeable in $\lambda\mu\tilde{\mu}^{\cap}$.

By considering strongly normalizing terms that go beyond those representing classical
proofs, we have opened a door to a new process calculus language which has to be
explored from the pragmatic as well as from the semantic point of view.
Abstract. We study the connections between graph models and “wave-style”
Geometry of Interaction (GoI) $\lambda$-models. The latter arise when
Abramsky’s GoI axiomatization, which generalizes Girard’s original GoI,
is applied to a traced monoidal category with the categorical product as
tensor, using a countable power as the traced strong monoidal functor !.
Abramsky hinted that the category Rel of sets and relations is the basic
setting for traditional denotational “static semantics”. However, the
category Rel together with the cartesian product apparently escapes
Abramsky’s original axiomatization. Here we show that, by moving to
the category $Rel_*$ of pointed sets and relations preserving the distinguished
point, and by slightly relaxing Abramsky’s GoI axiomatization,
we can recover a large class of graph-like models as wave models.
Furthermore, we show that the class of untyped $\lambda$-theories induced by
wave-style GoI models is richer than that induced by game models.

Keywords: (linear) graph model, traced monoidal category, weak
linear category, categorical geometry of interaction.



Introduction 

Geometry of Interaction and game models have been the most relevant novelties
in the last decade in the field of semantic analysis of proof theory and functional
languages.

In [1], Abramsky provides a categorical axiomatization/generalization of
Girard’s Geometry of Interaction (GoI) [18], embracing previous axiomatic
approaches, such as that based on dynamic algebras [16,17] and the one in [4].
This generalization is based on traced monoidal categories [23], and it consists
in building a compact closed category $\mathcal{G}(\mathcal{C})$ (GoI category) from a traced symmetric
monoidal category $\mathcal{C}$. In [2,3], the construction is extended to exponentials,
which, in a general categorical setting, are captured by a strong monoidal functor
! on the traced category $\mathcal{C}$, together with some additional structure. Under
these conditions on $\mathcal{C}$, the GoI category $\mathcal{G}(\mathcal{C})$ is a weak linear category (WLC),

* Research supported by the MIUR Projects COFIN 2001013518 Cometa and 
20022018192_002 Protocollo, and by the UE Project IST-2000-29001 Types. 



S. Berardi, M. Coppo, and F. Damiani (Eds.): TYPES 2003, LNCS 3085, pp. 242—258, 2004. 
i.e. a weakening of a linear category (see [13]). Moreover, every reflexive object
in a WLC gives rise to a linear combinatory algebra (LCA).

Following [1,2], there are two main instantiations of the GoI axiomatization.
In the “particle-style” GoI, the tensor on the underlying category is a coproduct
and the strong monoidal functor is a countable copower. Girard’s GoI is an
instance of this. Composition in the GoI category can be intuitively understood
by simulating the flow of a particle around a network. Dually, in the “wave-style”
GoI, the tensor is a product and the strong monoidal functor amounts to
a countable power. Composition in the GoI category is now defined statically and
globally. In [1], the category $(Rel, +)$ was suggested as the “basic setting” for
particle/dynamic semantics, while $(Rel, \times)$ as the “basic setting” for wave/static
semantics. This is clearly the case of the former, which underlies many game
categories in the style of [5], and contains as subalgebras those fruitfully used in
[6,8,7]. On the other hand, the thesis that $(Rel, \times)$ is the basic setting for static
semantics is less immediate, also because $(Rel, \times)$ itself apparently escapes the
original GoI axiomatization of [3].

The connections between traditional (static) semantics and wave GoI have
not received much attention in the literature, apart from the investigations of
some special wave-style models in [10,9]. In the present paper and in [20], this
connection has been taken seriously, and categories of relations in the wave-style
have been explored, vis-à-vis graph models.

In this paper, we show that, by moving to the category $Rel_*$ of pointed sets
and relations preserving the distinguished point, and by slightly relaxing the GoI
axiomatization of [3], we can recover many familiar graph models of $\lambda$-calculus.
In particular, we show that GoI algebras on $(Rel_*, \times_*)$ are essentially graph-like
models in the sense of Scott-Plotkin-Engeler [26,24]. Moreover, we show that
the $\lambda$-theories modeled in this setting do go beyond sensible and semi-sensible
theories. This should be contrasted with the fact that, in [14], it has been shown
that game models, i.e. particle-style GoI models, capture only a very limited
number of $\lambda$-theories, related to Böhm trees and Lévy-Longo trees.

The paper consists of three parts. 

In the first part (Section 1), we study standard graph models (GMs) and
linear graph models (LGMs) from a purely set-theoretical viewpoint. The latter
have been introduced by Abramsky, and they are special cases of LCAs, providing
combinatory models for the linear $\lambda$-calculus. A natural, but not entirely
immediate, fact that we prove is that every GM can be recovered from a LGM
via standardisation. An important consequence is that all $\lambda$-theories induced by
GMs can be captured by LGMs. In the literature many variants of the original
notion of GM have been studied (see e.g. [26,24,15]). Here we provide purely
set-theoretical presentations of some variants of the original notions of (L)GM,
which, as we will see, arise as wave GoI models.

In the second part of the paper (Section 2), we study LGMs from a categorical
point of view. In particular, we show that many (pointed) LGMs can be captured
as special WLCs in the wave-style. Somewhat surprisingly, the original
Scott-Plotkin-Engeler graph model escapes this categorical description, because of the
behaviour of the empty relation.

In the third part of the paper (Sections 3 and 4), we study wave GoI algebras.
In particular, we introduce a weaker axiomatization of the GoI situation of [3],
which still ensures that a GoI category is a WLC, and hence allows us to build
a GoI algebra. This weaker axiomatization allows us to capture the case of
$(Rel_*, \times_*)$. We show that the GoI algebras induced by $(Rel_*, \times_*)$ are
pointed LGMs. Finally, we show that there are wave GoI models realizing
non-sensible $\lambda$-theories.

Notation. Let $\mathcal{P}$, $\mathcal{P}_f$, $\mathcal{P}_{fne}$ denote the (finite, finite non-empty) powerset. Let $\subseteq$, $\subseteq_f$,
$\subseteq_{fne}$ denote (finite, finite non-empty) subset inclusion. Let $U$, $V$ be objects in a category
$\mathcal{C}$. We denote by $\tau : U \lhd V : \tau'$ a retraction of $U$ in $V$, i.e. $\tau' \circ \tau = id_U$. Let Pfn
be the category of sets and partial functions. Let Rel be the category of sets and
relations. Relations $f \subseteq A \times B$ will be denoted by $f : A \to B$. Let $Rel_*$ be the
category of pointed sets $(A, *_A)$ and relations which preserve the distinguished point,
i.e. $f : (A, *_A) \to (B, *_B)$ iff $(*_A, *_B) \in f$. The set of finite streams and the set of
infinite streams on a set $A$ will be denoted by $A^{<\omega}$ and $A^{\omega}$, respectively. Streams in
$A^{\omega}$ will be denoted by $\vec{a}, \vec{b}, \ldots$ The $i$-th component of a (finite) stream $\vec{a}$ is denoted
by $a_i$; $a^{\omega}$ denotes the infinite stream whose components are all equal to $a$.



1 Linear Combinatory Algebras and Graph Models 

In this section, first we recall basic facts concerning linear combinatory algebras
(LCAs) and we introduce the new notion of linear combinatory $\lambda$-model. Then
we focus on a special class of (linear) combinatory $\lambda$-models, i.e. (linear) graph
models ((L)GMs), and we show that every graph model is induced by a linear
one. Finally, we present some variants of the original notion of (L)GM, which will
be of interest in the sequel, namely Rel (L)GMs, pointed (L)GMs, stream-based
(L)GMs, together with their generalizations w.r.t. cardinality.



1.1 Linear Combinatory Algebras, $\lambda$-Models and Graph Models



The notion of linear combinatory algebra refines the notion of combinatory algebra,
in that it has an extra unary operation ! and a set of combinators, refining
Curry’s original set of combinators:

Definition 1 (Linear Combinatory Algebra). A linear combinatory algebra
(LCA) $\mathcal{A} = (A, \cdot, !)$ is an applicative structure $(A, \cdot)$ with a unary (injective)
operation !, and distinguished elements (combinators) $B, C, I, K, W, D, \delta, F$
satisfying the following equations (we associate $\cdot$ to the left and we assume ! to have
order of precedence greater than $\cdot$): for all $x, y, z \in A$,

$Bxyz = x(yz)$        $Wx!y = x!y!y$

$Cxyz = (xz)y$        $D!x = x$

$Ix = x$              $\delta!x = !!x$

$Kx!y = x$            $F!x!y = !(xy)$
The notion of LCA corresponds to a Hilbert style axiomatization of the
$\{!, \multimap\}$-fragment of linear logic. Every LCA induces a standard combinatory
algebra (CA), by the combinatory version of Girard’s translation of Intuitionistic
Logic into Intuitionistic Linear Logic (see [3] for more details), i.e.:

Proposition 1 (Standardisation). Let $(A, \cdot, !)$ be a LCA. Then $(A, \cdot_s)$, where
$x \cdot_s y = x \cdot !y$, is a CA with combinators $B_s, C_s, I_s, K_s, W_s$ defined by:

$C_s = D' \cdot C$        $I_s = D' \cdot I$

$B_s = C \cdot (B \cdot (B \cdot B \cdot B) \cdot (D' \cdot I)) \cdot (C \cdot ((B \cdot B) \cdot F) \cdot \delta)$

$K_s = D' \cdot K$        $W_s = D' \cdot W$,

where $D' = C(BBI)(BDI)$ is such that, for all $x, y$, $D'x!y = xy$.
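The equation $D'x!y = xy$ can be checked mechanically from the equations of Definition 1. The following Python sketch is an illustration only: the term encoding (strings for atoms, tuples for application and for !) and the helper names `app`, `bang`, `step`, `normalize` are assumptions of this sketch, and only the rules for $B$, $C$, $I$, $K$, $W$, $D$ are implemented.

```python
def app(*terms):
    """Left-associated application: app(f, a, b) stands for (f a) b."""
    result = terms[0]
    for term in terms[1:]:
        result = ("app", result, term)
    return result


def bang(term):
    """The unary operation !."""
    return ("bang", term)


def is_bang(term):
    return isinstance(term, tuple) and term[0] == "bang"


def spine(term):
    """Split f a1 ... an into its head f and the argument list [a1, ..., an]."""
    args = []
    while isinstance(term, tuple) and term[0] == "app":
        args.append(term[2])
        term = term[1]
    return term, args[::-1]


def head_step(head, args):
    """Contract a head redex, read left to right from the LCA equations."""
    n = len(args)
    if head == "I" and n >= 1:
        return app(*args)
    if head == "B" and n >= 3:
        return app(args[0], app(args[1], args[2]), *args[3:])
    if head == "C" and n >= 3:
        return app(args[0], args[2], args[1], *args[3:])
    if head == "D" and n >= 1 and is_bang(args[0]):
        return app(args[0][1], *args[1:])
    if head == "K" and n >= 2 and is_bang(args[1]):
        return app(args[0], *args[2:])
    if head == "W" and n >= 2 and is_bang(args[1]):
        return app(args[0], args[1], args[1], *args[2:])
    return None


def step(term):
    """One reduction step (head redex first), or None if no rule applies."""
    if isinstance(term, str):
        return None
    if is_bang(term):
        inner = step(term[1])
        return None if inner is None else bang(inner)
    head, args = spine(term)
    reduct = head_step(head, args)
    if reduct is not None:
        return reduct
    parts = [head] + args
    for i, part in enumerate(parts):
        reduced = step(part)
        if reduced is not None:
            return app(*parts[:i], reduced, *parts[i + 1:])
    return None


def normalize(term, fuel=1000):
    """Iterate step until no rule applies; fuel guards against divergence."""
    for _ in range(fuel):
        reduct = step(term)
        if reduct is None:
            return term
        term = reduct
    raise RuntimeError("no normal form found within the step budget")


D_PRIME = app("C", app("B", "B", "I"), app("B", "D", "I"))

print(normalize(app(D_PRIME, "x", bang("y"))))             # D' x !y rewrites to x y
print(normalize(app(D_PRIME, "K", bang("x"), bang("y"))))  # (K_s ._s x) ._s y rewrites to x
```

The second call illustrates the standardised combinator $K_s = D' \cdot K$: applying it with $\cdot_s$ to $x$ and then $y$ amounts to the term $D' K\, !x\, !y$, which rewrites to $x$.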

It is well known that $\lambda$-models à la Hindley-Longo can be characterized as
combinatory $\lambda$-models (see e.g. [11]):

Definition 2 (Combinatory $\lambda$-model). A CA $\mathcal{A} = (A, \cdot)$ is a combinatory
$\lambda$-model if there exists an extra selector combinator $\varepsilon$ such that, for all $x, y \in A$,

$\varepsilon xy = xy$ and $(\forall z.\ xz = yz) \Rightarrow \varepsilon x = \varepsilon y$.

Here we introduce the linear version of the notion of combinatory $\lambda$-model:

Definition 3 (Combinatory Linear $\lambda$-model). A LCA $\mathcal{A} = (A, \cdot, !)$ is a
combinatory linear $\lambda$-model if there exist a linear selector $\varepsilon$ and a selector
combinator $\varepsilon_s$ such that, for all $x, y \in A$,

$\varepsilon xy = xy$ and $(\forall z.\ xz = yz) \Rightarrow \varepsilon x = \varepsilon y$
$\varepsilon_s !x!y = x!y$ and $(\forall z.\ x!z = y!z) \Rightarrow \varepsilon_s !x = \varepsilon_s !y$.



Then we have: 

Proposition 2. Every combinatory linear $\lambda$-model gives rise by standardisation
to a combinatory $\lambda$-model.

Graph models (GMs) à la Scott-Plotkin-Engeler and Abramsky’s linear graph
models (LGMs) are examples of combinatory (linear) $\lambda$-models:

Definition 4 ((Linear) Graph Model). A graph model (GM) $\mathcal{U}$ is an
applicative structure $(\mathcal{P}(U), \cdot_{\tau})$, where $U$ is a (infinite) set with a retraction in
Pfn$^1$ $\tau : \mathcal{P}_f(U) \times U \lhd U$, and the application $\cdot_{\tau}$ is defined by: for all $x, y \in \mathcal{P}(U)$,

$x \cdot_{\tau} y = \{a \mid \exists w \subseteq_f y.\ \tau(w, a) \in x\}$.

A linear graph model (LGM) is a structure $\mathcal{U} = (\mathcal{P}(U), \cdot_{\tau_1}, !_{\tau_2})$, where $U$ is a
(infinite) set with retractions in Pfn $\tau_1 : U \times U \lhd U$ and $\tau_2 : \mathcal{P}_f(U) \lhd U$, and
linear application $\cdot_{\tau_1}$ and $!_{\tau_2}$ are defined by: for all $x, y \in \mathcal{P}(U)$,

$x \cdot_{\tau_1} y = \{a \mid \exists b \in y.\ \tau_1(b, a) \in x\}$,

$!_{\tau_2} x = \{\tau_2(w) \mid w \subseteq_f x\}$.



$^1$ I.e. an injection.
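As an illustration of the application of Definition 4, the following Python sketch computes $x \cdot_{\tau} y$ for finite sets. It rests on assumptions that are not in the text: $U$ is approximated by arbitrary hashable Python values, and the injection $\tau$ is realised by a hypothetical tagged-tuple coding `tau`.

```python
def tau(w, a):
    """Assumed injective coding of (finite set w, element a) as one token of U."""
    return ("arrow", frozenset(w), a)


def apply_gm(x, y):
    """x ._tau y = {a | some finite w included in y has tau(w, a) in x}."""
    result = set()
    for token in x:
        # only tokens in the image of tau can contribute to the application
        if isinstance(token, tuple) and len(token) == 3 and token[0] == "arrow":
            _, w, a = token
            if w <= y:
                result.add(a)
    return result


x = {tau({"a"}, "r1"), tau({"a", "b"}, "r2"), tau(set(), "r0")}
print(apply_gm(x, {"a"}))
```

Scanning the tokens of $x$ rather than enumerating the finite subsets of $y$ gives the same set, because a token contributes exactly when its first component is included in $y$.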
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One can define combinators on (linear) graph models in such a way that: 
Proposition 3. Every (L)GM is a combinatory (linear) X-model. 

As one expects, given a LGM, by standardisation, we get a GM: 

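Proposition 4. Let $\mathcal{U} = (\mathcal{P}(U), \cdot_{\tau_1}, !_{\tau_2})$ be a LGM with retractions $\tau_1 : U \times U \lhd U$
and $\tau_2 : \mathcal{P}_f(U) \lhd U$. Then by standardisation we get a GM $S(\mathcal{U}) = (\mathcal{P}(U), \cdot_\tau)$,
where the retraction $\tau : \mathcal{P}_f(U) \times U \lhd U$ is defined by $\tau = \tau_1 \circ (\tau_2 \times id_U)$.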
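Proposition 4 can be checked on a finite instance. The Python sketch below assumes that standard application is obtained from linear application as $x \cdot\, !y$, and it uses tagged tuples as hypothetical injective codings (`tau1`, `tau2` are illustrative names, not taken from the paper). It compares application in the standardised GM with coding $\tau = \tau_1 \circ (\tau_2 \times id_U)$ against $x \cdot_{\tau_1} !_{\tau_2} y$.

```python
from itertools import chain, combinations


def tau1(b, a):
    """Hypothetical coding U x U -> U."""
    return ("pair", b, a)


def tau2(w):
    """Hypothetical coding P_f(U) -> U."""
    return ("set", frozenset(w))


def tau(w, a):
    """tau = tau1 o (tau2 x id), as in Proposition 4."""
    return tau1(tau2(w), a)


def is_tagged(c, tag, size):
    return isinstance(c, tuple) and len(c) == size and c[0] == tag


def lin_apply(x, y):
    """x ._tau1 y = {a | there is b in y with tau1(b, a) in x}."""
    return {c[2] for c in x if is_tagged(c, "pair", 3) and c[1] in y}


def bang(y):
    """!_tau2 y = {tau2(w) | w finite subset of y}; y must be finite here."""
    ys = list(y)
    subsets = chain.from_iterable(combinations(ys, r) for r in range(len(ys) + 1))
    return {tau2(w) for w in subsets}


def std_apply(x, y):
    """x ._tau y = {a | there is a finite w included in y with tau(w, a) in x}."""
    y = frozenset(y)
    return {
        c[2]
        for c in x
        if is_tagged(c, "pair", 3) and is_tagged(c[1], "set", 2) and c[1][1] <= y
    }


x = {tau({1, 2}, "out"), tau(set(), "always"), tau({7}, "never")}
y = {1, 2, 3}
```

On this instance both `std_apply(x, y)` and `lin_apply(x, bang(y))` select exactly the entries of `x` whose premise is a subset of `y`.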

An interesting fact, which has a simple but non-trivial proof, is that there
exists a (non-trivial) dual construction for building a LGM from a GM, for which
standardisation is an inverse:

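Proposition 5. For any GM $\mathcal{U}$, there is a LGM $C(\mathcal{U})$ such that $S(C(\mathcal{U})) = \mathcal{U}$.

Proof. Let $\mathcal{U} = (\mathcal{P}(U), \cdot_\tau)$ be a GM. For any bijection $\zeta : U \to \mathcal{P}_f(U)$ (which exists
by cardinality reasons), the retractions $\tau_1 : U \times U \lhd U$, $\tau_1 = \tau \circ (\zeta \times id_U)$, and
$\tau_2 : \mathcal{P}_f(U) \lhd U$, $\tau_2 = \zeta^{-1}$ induce a LGM $(\mathcal{P}(U), \cdot_{\tau_1}, !_{\tau_2})$. We take such LGM as $C(\mathcal{U})$.
□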

Remarkably, the construction in the proof of Proposition 5 holds for any
choice of the bijection $\zeta$. From the point of view of $\lambda$-theories, we have the
following important consequence:

Corollary 1. The class of $\lambda$-theories induced by GMs coincides with the class
of $\lambda$-theories induced by GMs obtained from LGMs via standardisation.

1.2 Rel (Linear) Graph Models 

We introduce a class of generalized (L)GMs, where retractions are allowed to be
relations instead of functions. The constructions in [26] and some weak variants
of filter models [12] can be viewed as instances of Rel graph models.

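Definition 5 (Rel (Linear) Graph Model). A Rel graph model is an
applicative structure $(\mathcal{P}(U), \cdot_\tau)$, where $U$ is an (infinite) set with a retraction in Rel
$\tau : \mathcal{P}_f(U) \times U \lhd U$, and the application $\cdot_\tau$ is defined by: for all $x, y \in \mathcal{P}(U)$,

$x \cdot_\tau y = \{a \mid \exists w \subseteq_f y.\ \exists c \in x.\ ((w, a), c) \in \tau\}$.

A Rel linear graph model is a structure $\mathcal{U} = (\mathcal{P}(U), \cdot_{\tau_1}, !_{\tau_2})$ where $U$ is an
(infinite) set with retractions in Rel $\tau_1 : U \times U \lhd U$ and $\tau_2 : \mathcal{P}_f(U) \lhd U$, and
linear application $\cdot_{\tau_1}$ and $!_{\tau_2}$ are defined by: for all $x, y \in \mathcal{P}(U)$,

$x \cdot_{\tau_1} y = \{a \mid \exists b \in y.\ \exists c \in x.\ ((b, a), c) \in \tau_1\}$,

$!_{\tau_2} x = \{a \mid \exists w \subseteq_f x.\ (w, a) \in \tau_2\}$.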
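The only change with respect to ordinary graph models is that the coding is now a relation, so one pair may have several codes. A minimal Python sketch of the application of Definition 5 follows; the relation `tau` is a small hypothetical fragment of a coding (two codes for the same pair), chosen purely for illustration.

```python
def rel_apply(tau, x, y):
    """x . y = {a | there are w included in y and c in x with ((w, a), c) in tau}."""
    y = frozenset(y)
    return {a for ((w, a), c) in tau if c in x and w <= y}


# Hypothetical fragment of a relational coding: the pair ({1}, "p") has two codes.
tau = {
    ((frozenset({1}), "p"), "c1"),
    ((frozenset({1}), "p"), "c2"),
    ((frozenset({2}), "q"), "c3"),
}
```

Either code of the pair `({1}, "p")` triggers the answer `"p"` when the argument contains `1`.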

Proposition 6. Rel (L)GMs are combinatory (linear) $\lambda$-models.

Proof. We just show how to define selectors. Linear selector:
$\epsilon = \{d \mid \exists a, b, c \in U.\ (((c, c), d) \in \tau_1 \wedge ((a, b), c) \in \tau_1)\}$.
Standard selector:
$\epsilon_s = \{b \mid \exists w \subseteq v \subseteq_f U.\ \exists a, c, c' \in U.\ (((\{c\}, c'), b) \in \tau \wedge ((w, a), c) \in \tau \wedge ((v, a), c') \in \tau)\}$. □



One can check that Propositions 4-5 extend to Rel (L)GMs. 
1.3 Pointed Rel (Linear) Graph Models

Pointed Rel graph models arise when we carry out the graph model construction
in $\mathit{Rel}_*$. Namely, we fix a special point $* \in U$, and we take as carrier the pointed
powerset $(\mathcal{P}_* U, \{*\})$, i.e. the set of all subsets $u$ of $U$ such that $* \in u$, together
with point-preserving codings.

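Definition 6 (Pointed Rel (Linear) Graph Model). Let $U$ be an (infinite)
set with a special point $* \in U$.

A pointed Rel graph model (pointed graph model, for short) is an applicative
structure $(\mathcal{P}_*(U), \cdot_{\tau^*})$, where $\tau^* : \mathcal{P}_*^f(U) \times U \lhd U$ is a retraction in $\mathit{Rel}_*$, i.e.
$((\{*\}, *), *) \in \tau^*$, $\mathcal{P}_*^f(U)$ denotes the set of all finite pointed subsets of $U$, and
the application $\cdot_{\tau^*}$ is defined by: for all $x, y \in \mathcal{P}_*(U)$,

$x \cdot_{\tau^*} y = \{a \mid \exists w \subseteq_{fne} y.\ \exists c \in x.\ ((w, a), c) \in \tau^*\}$.

A pointed Rel linear graph model (pointed linear graph model, for short) is a
structure $\mathcal{U} = (\mathcal{P}_*(U), \cdot_{\tau_1^*}, !_{\tau_2^*})$ where $\tau_1^* : U \times U \lhd U$, $\tau_2^* : \mathcal{P}_*^f(U) \lhd U$ are
retractions in $\mathit{Rel}_*$, i.e. $((*, *), *) \in \tau_1^*$ and $(\{*\}, *) \in \tau_2^*$, and linear application
$\cdot_{\tau_1^*}$ and $!_{\tau_2^*}$ are defined by: for all $x, y \in \mathcal{P}_*(U)$,

$x \cdot_{\tau_1^*} y = \{a \mid \exists b \in y.\ \exists c \in x.\ ((b, a), c) \in \tau_1^*\}$,

$!_{\tau_2^*} x = \{a \mid \exists w \subseteq_{fne} x.\ (w, a) \in \tau_2^*\}$.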



Proposition 7. Every pointed (L)GM is a combinatory (linear) $\lambda$-model.

Similar results to those in Propositions 4-5 hold then for pointed (L)GMs. 

Remark 1. Notice that, in the spirit of [25], the above constructions of (L)GMs go
through even if we consider $\mathcal{P}_{<\kappa}$, for any regular cardinal $\kappa$, in place of $\mathcal{P}_f$ (i.e. $\mathcal{P}_{<\omega}$).
We call $\kappa$-(L)GMs the corresponding (L)GMs induced by such codings.



1.4 Stream-Based (Linear) Graph Models 

Interesting variants of graph models are obtained by considering codings on 
(possibly finite) streams, in place of codings on the powerset. As we will see, 
these can be actually viewed as special cases of graph models as defined in the 
previous sections. The interest of stream-based graph models lies in the fact that, 
as we will see, they arise in the categorical context of weak linear categories and 
of Geometry of Interaction. The standard, Rel and pointed graph models
considered previously all have corresponding stream-based variants. Here we give
the details only in the case of pointed graph models with finite streams. Similarly 
one can define (pointed) graph models with streams with finite codomain, or 
general streams. 
Definition 7 (Finite-stream Graph Models). Let $U$ be an (infinite) set with
a special point $* \in U$.

A finite-stream graph model is an applicative structure $(\mathcal{P}_*(U), \cdot_{\tau^*})$, where
$\tau^* : U_*^{<\omega} \times U \lhd U$, $U_*^{<\omega}$ is the set of finite streams with at least one occurrence of
$*$, and $\tau^*$ is a retraction in $\mathit{Rel}_*$, i.e. $(((*), *), *) \in \tau^*$, and the application $\cdot_{\tau^*}$ is
defined by: for all $x, y \in \mathcal{P}_*(U)$,

$x \cdot_{\tau^*} y = \{a \mid \exists u.\ \exists c \in x.\ (\forall i.\ u_i \in y \wedge ((u, a), c) \in \tau^*)\}$.

A finite-stream linear graph model is a structure $\mathcal{U} = (\mathcal{P}_*(U), \cdot_{\tau_1^*}, !_{\tau_2^*})$ where
$\tau_1^* : U \times U \lhd U$, $\tau_2^* : U_*^{<\omega} \lhd U$ are retractions in $\mathit{Rel}_*$, i.e. $((*, *), *) \in \tau_1^*$
and $((*), *) \in \tau_2^*$, and linear application $\cdot_{\tau_1^*}$ and $!_{\tau_2^*}$ are defined by: for all
$x, y \in \mathcal{P}_*(U)$,

$x \cdot_{\tau_1^*} y = \{a \mid \exists b \in y.\ \exists c \in x.\ ((b, a), c) \in \tau_1^*\}$,

$!_{\tau_2^*} x = \{a \mid \exists u.\ (\forall i.\ u_i \in x \wedge (u, a) \in \tau_2^*)\}$.



Proposition 8. Every finite-stream (L)GM is a combinatory (linear) $\lambda$-model.

Similar results to those in Propositions 4-5 hold also for finite-stream (L)GMs. 

The connection between stream-based (L)GMs and powerset-based (L)GMs 
is given by the following 

Theorem 1. Every finite-stream (L)GM is isomorphic to a pointed (L)GM.

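Proof. We give the proof for the linear case. Let $(\mathcal{P}_*(U), \cdot_{\tau_1^*}, !_{\tau_2^*})$ be a finite-stream
linear graph model. Now we take $\zeta : \mathcal{P}_*^f U \mathrel{-\!\circ\!\to} U_*^{<\omega}$ to be the injective relation defined
by $(v, a) \in \zeta$ iff, for all $i$, $a_i \in v$ and for all $b \in v$ there exists $i$ s.t. $a_i = b$. Then
$\theta_2^* : \mathcal{P}_*^f U \mathrel{-\!\circ\!\to} U$, $\theta_2^* = \tau_2^* \circ \zeta$, is s.t. $!_{\theta_2^*} x = \{a \mid \exists v \subseteq_{fne} x.\ (v, a) \in \theta_2^*\} = !_{\tau_2^*} x$, i.e. the
finite-stream LGM $(\mathcal{P}_*(U), \cdot_{\tau_1^*}, !_{\tau_2^*})$ is a pointed LGM with codings $\tau_1^*$, $\theta_2^*$. Notice that
the coding $\theta_2^*$ in the proof above is forced to be non-functional. □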
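The passage from streams to finite sets in this proof can be replayed on a finite fragment. In the Python sketch below, finite streams are tuples, `tau2` is a small hypothetical fragment of a stream coding (a set of pairs `(stream, code)`), and `to_pointed` computes $\theta_2^* = \tau_2^* \circ \zeta$ by replacing each stream with the finite set of its entries. All names are illustrative assumptions.

```python
STAR = "*"


def bang_stream(tau2, x):
    """!x = {a | there is a stream u with every u_i in x and (u, a) in tau2}."""
    return {a for (u, a) in tau2 if all(ui in x for ui in u)}


def to_pointed(tau2):
    """theta2 = tau2 o zeta: each stream is replaced by the set of its entries."""
    return {(frozenset(u), a) for (u, a) in tau2}


def bang_pointed(theta2, x):
    """!x = {a | there is a finite non-empty v included in x with (v, a) in theta2}."""
    x = frozenset(x)
    return {a for (v, a) in theta2 if v and v <= x}


# Hypothetical fragment of a coding of finite streams containing the point.
tau2 = {((STAR,), STAR), ((STAR, 1), "a"), ((1, STAR, 1), "b"), ((STAR, 2), "c")}
theta2 = to_pointed(tau2)
x = {STAR, 1}
```

The streams `(*, 1)` and `(1, *, 1)` have the same set of entries, so the finite set `{*, 1}` is related by `theta2` to both `"a"` and `"b"`: this is the sense in which the induced coding is not functional.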

However, notice that the converse of the above theorem fails. 

2 Weak Linear Categories and Linear Graph Models 

In this section, we discuss Abramsky's axiomatic construction of an LCA from a
weak linear category (WLC). WLCs are the counterpart for linear combinatory
algebras of the notion of linear category for linear $\lambda$-models (see [13]). In particular,
we show that the category $\mathit{Rel}_*$ with tensor the cartesian product, together
with suitable stream-based functors turns out to be a WLC. Moreover, the LCAs
arising from the WLC $\mathit{Rel}_*$ capture exactly the classes of stream-based (L)GMs.

We start by recalling Abramsky’s notion of WLC and the construction of an 
LCA from a WLC (for basic categorical definitions see Appendix A). 
Definition 8 (Weak Linear Category). A weak linear category (WLC)
$(\mathcal{C}, \otimes, !)$ consists of:

- a symmetric monoidal closed category $(\mathcal{C}, \otimes)$;

- a symmetric monoidal functor $! : \mathcal{C} \to \mathcal{C}$;

- the following four monoidal pointwise natural transformations:

  $der : {!} \Rightarrow Id$ (dereliction)
  $\delta : {!} \Rightarrow {!!}$ (comultiplication)
  $con : {!} \Rightarrow {!} \otimes {!}$ (contraction)
  $weak : {!} \Rightarrow K_I$ (weakening),

  where $K_I$ is the constant $I$ functor.



Definition 9. A reflexive object in a WLC $\mathcal{C}$ is an object $V$ in $\mathcal{C}$ with the
following retracts:

$V \multimap V \lhd V$, $!V \lhd V$, $I \lhd V$.

Theorem 2 ([3]). Let $(\mathcal{C}, \otimes, !)$ be a WLC and $V$ be a reflexive object in $\mathcal{C}$ with
retracts $\theta_1 : V \multimap V \lhd V : \theta_1'$ and $\theta_2 : {!V} \lhd V : \theta_2'$. Then $(\mathcal{C}(I, V), \cdot, !)$ is a LCA,
where $\cdot$ and $!$ are defined by: for $f, g \in \mathcal{C}(I, V)$,

$f \cdot g = ev \circ ((\theta_1' \circ f) \otimes g) \circ \rho_I^{-1}$, $!f = \theta_2 \circ (!f) \circ \phi'_I$,

where $\phi'_I : I \to {!I}$ is the isomorphism associated to the strong monoidal functor
$!$ (see Appendix A).

The category $\mathit{Rel}_*$ is symmetric monoidal closed w.r.t. the product $\times_*$ inherited
from Rel. Moreover, $(\mathit{Rel}_*, \times_*)$ together with any of the following symmetric
monoidal functors based on streams is a WLC:

Definition 10. i) Let $(\ )_*^{<\omega} : \mathit{Rel}_* \to \mathit{Rel}_*$ be the functor defined by:

- for any pointed set $(A, *_A)$, let $(A, *_A)_*^{<\omega} = (A_*^{<\omega}, (*_A))$, where $A_*^{<\omega}$ is the
set of finite streams with at least one occurrence of $*_A$;

- for any $f : (A, *_A) \mathrel{-\!\circ\!\to} (B, *_B)$, let $f_*^{<\omega} : (A_*^{<\omega}, (*_A)) \mathrel{-\!\circ\!\to} (B_*^{<\omega}, (*_B))$ be
defined by $(a, b) \in f_*^{<\omega}$ iff $|a| = |b|$ and $\forall i.\ (a_i, b_i) \in f$.

ii) Let $(\ )_*^{\omega} : \mathit{Rel}_* \to \mathit{Rel}_*$ be the functor defined by:

- for any pointed set $(A, *_A)$, $(A, *_A)_*^{\omega} = (A_*^{\omega}, *_A^{\omega})$, where $A_*^{\omega}$ is the set of
(infinite) streams with at least one occurrence of $*_A$;

- for any $f : (A, *_A) \mathrel{-\!\circ\!\to} (B, *_B)$, let $f_*^{\omega} : (A_*^{\omega}, *_A^{\omega}) \mathrel{-\!\circ\!\to} (B_*^{\omega}, *_B^{\omega})$ be
defined by $(a, b) \in f_*^{\omega}$ iff $\forall i.\ (a_i, b_i) \in f$.

iii) Let $(\ )_{*f}^{\omega} : \mathit{Rel}_* \to \mathit{Rel}_*$ be the functor defined by:

- for any pointed set $(A, *_A)$, $(A, *_A)_{*f}^{\omega} = (A_{*f}^{\omega}, *_A^{\omega})$, where $A_{*f}^{\omega}$ is the
restriction of $A_*^{\omega}$ to the streams with finite codomain (2);

- the definition on arrows is similar to (ii).

(2) I.e. the set $[N \to_{fcod} X]$ of functions from $N$ in $X$ with finite codomain.
Proposition 9. $(\mathit{Rel}_*, \times_*)$ with one of the functors in Definition 10 is a WLC.

Proof. The proof of the fact that $(\mathit{Rel}_*, \times_*)$ is symmetric monoidal closed, with right
adjoint of $\times_*$ the product $\times_*$ itself, and of the fact that the functors of Definition 10
are monoidal is routine. We just sketch the definition of some natural transformations
for $!_* = (\ )_*^{\omega}$. Let $con : (\ )_*^{\omega} \Rightarrow (\ )_*^{\omega} \times (\ )_*^{\omega}$, $con_A : (A)_*^{\omega} \mathrel{-\!\circ\!\to} (A)_*^{\omega} \times (A)_*^{\omega}$, be defined
by $con_A = \{(a, (a', a'')) \mid a, a', a'' \in A_*^{\omega} \wedge \forall i \geq 0.\ a_{2i+1} = a'_i \wedge \forall i \geq 1.\ a_{2i} = a''_i\}$.
Let $weak_A : (A)_*^{\omega} \mathrel{-\!\circ\!\to} I$, $weak_A = \{(a^{\omega}, *) \mid a \in A\}$. □



Theorem 3. The class of LCAs generated by reflexive objects in the WLC
$(\mathit{Rel}_*, \times_*, !)$, where $!$ is either $(\ )_*^{<\omega}$ or $(\ )_{*f}^{\omega}$ or $(\ )_*^{\omega}$, is isomorphic to the
class of finite-stream, finite-codomain stream, and stream LGMs, respectively.

Proof. We consider the case $!_* = (\ )_*^{<\omega}$. The other cases can be dealt with similarly.
Using Theorem 2, one can easily check that any set $U$ with retracts $\theta_1 : U \times_* U \lhd U$
and $\theta_2 : U_*^{<\omega} \lhd U$ gives rise to a LCA isomorphic, via $\lambda_U : I \times_* U \to U$, to a
finite-stream graph model. And vice versa. □

As a consequence, by Theorem 1, the LCAs generated by reflexive objects in
the WLC $(\mathit{Rel}_*, \times_*, !_*)$ are pointed LGMs.

The situation is summarized in Table 1. Notice that, in the case of infinite
streams (possibly with finite codomain), the cardinality of $U$ is at least $2^{\aleph_0}$.
Moreover, for $! = (\ )_*^{\omega}$, we get a pointed $\omega_1$-LGM.



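Table 1. Wave graph models on $\mathit{Rel}_*$.

WLC: $(\mathit{Rel}_*, \times_*, (\ )_*^{<\omega})$
UNIVERSAL OBJECT: $\theta_1^* : U \times U \lhd U$, $\theta_2^* : U_*^{<\omega} \lhd U$
LCA: $\exists \zeta : \mathcal{P}_*^f U \lhd U_*^{<\omega}$ s.t. $(\mathcal{P}_* U, \cdot_{\theta_1^*}, !_{\theta_2^* \circ \zeta})$ pointed LGM

WLC: $(\mathit{Rel}_*, \times_*, (\ )_{*f}^{\omega})$
UNIVERSAL OBJECT: $\theta_1^* : U \times U \lhd U$, $\theta_2^* : U_{*f}^{\omega} \lhd U$
LCA: $\exists \zeta : \mathcal{P}_*^f U \lhd U_{*f}^{\omega}$ s.t. $(\mathcal{P}_* U, \cdot_{\theta_1^*}, !_{\theta_2^* \circ \zeta})$ pointed LGM

WLC: $(\mathit{Rel}_*, \times_*, (\ )_*^{\omega})$
UNIVERSAL OBJECT: $\theta_1^* : U \times U \lhd U$, $\theta_2^* : U_*^{\omega} \lhd U$
LCA: $\exists \zeta : \mathcal{P}_*^{<\omega_1} U \lhd U_*^{\omega}$ s.t. $(\mathcal{P}_* U, \cdot_{\theta_1^*}, !_{\theta_2^* \circ \zeta})$ pointed $\omega_1$-LGM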



Remark 2. Notice that the basic category Rel of sets and relations fails to be a WLC,
since there is no notion of "empty" stream, and the definition of weakening
$weak_A : A^{\omega} \mathrel{-\!\circ\!\to} I$, $weak_A = \{(a^{\omega}, *) \mid a \in A, * \in I\}$ is not natural on the empty relation. This
is the reason why one has to shift to pointed relations.

Moreover, the powerset functor $\mathcal{P}_f$ fails to induce a structure of WLC on Rel (and
$\mathit{Rel}_*$). Namely, the "natural" definition of dereliction, i.e. $der_A : \mathcal{P}_f(A) \mathrel{-\!\circ\!\to} A$, $der_A =
\{(\{a\}, a) \mid a \in A\}$, is not pointwise natural.








3 The Geometry of Interaction Construction 

In this section, we recall the categorical axiomatization of the Geometry of
Interaction (GoI) developed in [1,2,3]. This is based on traced categories, see [23].
Any traced monoidal category gives rise, by the construction of [1], to a GoI
category. If moreover the traced category we start from has a strong traced
monoidal functor together with suitable retractions, then the GoI category is a
WLC, and hence, by Theorem 2, it can generate a (GoI) LCA. This situation,
called GoI situation, is axiomatized and studied in [3]. However, a GoI situation
gives only sufficient conditions for a GoI category to be a WLC. In this section,
we introduce the notion of weak GoI situation (wGoI situation), in which we give
weaker, but still sufficient conditions on the retractions for the GoI category to
be a WLC. In the next section, we will see that the notion of wGoI situation
captures many graph models introduced in Section 1.

GoI categories arise from traced symmetric monoidal categories by the
following construction:

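Proposition 10 (GoI Construction, [1]). Given a traced symmetric
monoidal category $\mathcal{C}$, we define a GoI category $\mathcal{G}(\mathcal{C})$ by:

- Objects: pairs of objects of $\mathcal{C}$, denoted by $(A^+, A^-)$, where $A^+$ and $A^-$ are
objects of $\mathcal{C}$.

- Arrows: an arrow $f : (A^+, A^-) \to (B^+, B^-)$ in $\mathcal{G}(\mathcal{C})$ is $f : A^+ \otimes B^- \to
A^- \otimes B^+$ in $\mathcal{C}$.

- Identity: $id_{(A^+, A^-)} = \sigma_{A^+, A^-}$.

- Composition: it is given by symmetric feedback. Given $f : (A^+, A^-) \to
(B^+, B^-)$ and $g : (B^+, B^-) \to (C^+, C^-)$, $g \circ f : (A^+, A^-) \to (C^+, C^-)$
is given by: $g \circ f = Tr^{B^- \otimes B^+}_{A^+ \otimes C^-, A^- \otimes C^+}(\gamma' \circ (f \otimes g) \circ \gamma)$, where
$\gamma = (id_{A^+} \otimes id_{B^-} \otimes \sigma_{C^-, B^+}) \circ (id_{A^+} \otimes \sigma_{C^-, B^-} \otimes id_{B^+})$ and
$\gamma' = (id_{A^-} \otimes id_{C^+} \otimes \sigma_{B^+, B^-}) \circ (id_{A^-} \otimes \sigma_{B^+, C^+} \otimes id_{B^-}) \circ (id_{A^-} \otimes id_{B^+} \otimes \sigma_{B^-, C^+})$.

- Tensor: $(A^+, A^-) \otimes (B^+, B^-) = (A^+ \otimes B^+, A^- \otimes B^-)$, and, for any
$f : (A^+, A^-) \to (B^+, B^-)$ and $g : (C^+, C^-) \to (D^+, D^-)$,
$f \otimes g = (id_{A^-} \otimes \sigma_{B^+, C^-} \otimes id_{D^+}) \circ (f \otimes g) \circ (id_{A^+} \otimes \sigma_{C^+, B^-} \otimes id_{D^-})$.

- Unit: $(I, I)$.

Then $\mathcal{G}(\mathcal{C})$ is compact closed. Moreover, $F : \mathcal{C} \to \mathcal{G}(\mathcal{C})$ with $F(A) = (A, I)$
and $F(f) = f$ is a full and faithful embedding.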
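To make the symmetric-feedback composition concrete, here is a Python sketch for finite relations, taking the cartesian product as tensor and the relational trace that identifies the fed-back components. It works in plain Rel rather than the pointed setting, and the representation of an arrow $f : (A^+, A^-) \to (B^+, B^-)$ as a set of pairs `((a_plus, b_minus), (a_minus, b_plus))`, as well as all function names, are assumptions made for illustration. Unfolding the trace formula gives: $g \circ f$ relates $(a^+, c^-)$ to $(a^-, c^+)$ when some $b^+$ emitted by $f$ is consumed by $g$ and some $b^-$ emitted by $g$ is consumed by $f$.

```python
def goi_identity(plus, minus):
    """id on (A+, A-) is the symmetry A+ x A- -> A- x A+."""
    return {((p, m), (m, p)) for p in plus for m in minus}


def goi_compose(g, f):
    """g o f by symmetric feedback: the B-components circulate between f and g."""
    return {
        ((a_plus, c_minus), (a_minus, c_plus))
        for ((a_plus, b_minus), (a_minus, b_plus)) in f
        for ((b_plus2, c_minus), (b_minus2, c_plus)) in g
        if b_plus2 == b_plus and b_minus2 == b_minus
    }


A_plus, A_minus = {"a+"}, {"a-", "a'-"}
B_plus, B_minus = {"b+", "b'+"}, {"b-"}

# A sample arrow f : (A+, A-) -> (B+, B-), i.e. a relation A+ x B- -> A- x B+.
f = {(("a+", "b-"), ("a-", "b+")), (("a+", "b-"), ("a'-", "b'+"))}
```

On this instance the symmetry indeed behaves as a two-sided identity for the feedback composition.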

In [3], sufficient conditions are given on the traced monoidal category $\mathcal{C}$ for
$\mathcal{G}(\mathcal{C})$ to be a WLC, and hence, by Theorem 2, to give rise to a GoI LCA. We
recall this construction:

Definition 11 (GoI Situation, [3]). A GoI situation is a triple $(\mathcal{C}, T, U)$,
where:

- $\mathcal{C}$ is a traced symmetric monoidal category;

- $T : \mathcal{C} \to \mathcal{C}$ is a traced strong symmetric monoidal functor with the following
retractions (which are monoidal natural transformations):

  1. $e : TT \lhd T : e'$ (Comultiplication)
  2. $d : Id \lhd T : d'$ (Dereliction)
  3. $c : T \otimes T \lhd T : c'$ (Contraction)
  4. $w : K_I \lhd T : w'$ (Weakening), where $K_I$ denotes the constant $I$ functor.

- $U$ is an object of $\mathcal{C}$, called a GoI reflexive object, with retractions

  1. $\theta_1 : U \otimes U \lhd U : \theta_1'$
  2. $I \lhd U$
  3. $\theta_2 : TU \lhd U : \theta_2'$.

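Theorem 4 ([3]). Let $(\mathcal{C}, T, U)$ be a GoI situation. Then

i) $(\mathcal{G}(\mathcal{C}), !)$ is a WLC with $! : \mathcal{G}(\mathcal{C}) \to \mathcal{G}(\mathcal{C})$ defined as follows: $!(A^+, A^-) =
(TA^+, TA^-)$, and, for $f : (A^+, A^-) \to (B^+, B^-)$,
$!f = TA^+ \otimes TB^- \xrightarrow{\phi} T(A^+ \otimes B^-) \xrightarrow{Tf} T(A^- \otimes B^+) \xrightarrow{\phi^{-1}} TA^- \otimes TB^+$.

ii) $(\mathcal{C}(U, U), \cdot, !)$ is a LCA, where for any $f, g \in \mathcal{C}(U, U)$,
$f \cdot g = Tr^U_{U,U}((id_U \otimes g) \circ (\theta_1' \circ f \circ \theta_1))$, and $!f = \theta_2 \circ Tf \circ \theta_2'$.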

Definition 12. We call GoI LCA (or GoI algebra) a LCA which comes from
a GoI situation.

As pointed out in [3], particle GoI situations arise when the strong monoidal
functor is given by a countable copower. Dually, in the wave case, GoI situations
arise when the monoidal functor is given by a countable power. In what follows,
we focus on the wave case. An example of wave GoI category is ($\omega$-CPO, $\times$)
together with the stream functor $(\ )^{\omega}$, [3]. However, the basic setting for wave
GoI, i.e. $(\mathit{Rel}_*, \times_*)$ (and $(\mathit{Rel}, \times)$) together with the stream functor, fails to give
rise to a GoI situation. The induced GoI category, however, is still a WLC. In the
next definition we introduce the notion of weak GoI situation, which captures
the basic case of $(\mathit{Rel}_*, \times_*)$. In a weak GoI situation, the naturality condition of
the retractions is relaxed, by requiring only naturality up-to retraction:

Definition 13 (Weak GoI Situation). A weak GoI situation (wGoI situation)
is a triple $(\mathcal{C}, T, U)$ where $\mathcal{C}$ and $U$ satisfy the conditions in Definition 11,
and $T$ is a traced strong symmetric monoidal functor with retractions which are
natural only up-to retraction, i.e.:

1. $\{e_A : TTA \lhd TA : e'_A\}_A$ (Comultiplication) is a family of monoidal retractions
s.t., for all $f : A \to B$, $e'_B \circ Tf \circ e_A = TTf$;

2. $\{d_A : A \lhd TA : d'_A\}_A$ (Dereliction) is a family of monoidal retractions s.t.,
for all $f : A \to B$, $d'_B \circ Tf \circ d_A = f$;

3. $\{c_A : TA \otimes TA \lhd TA : c'_A\}_A$ (Contraction) is a family of monoidal retractions
s.t., for all $f : A \to B$, $c'_B \circ Tf \circ c_A = Tf \otimes Tf$;

4. $\{w_A : I \lhd TA : w'_A\}_A$ (Weakening) is a family of monoidal retractions s.t.,
for all $f : A \to B$, $w'_B \circ Tf \circ w_A = id_I$.

It is immediate to see that a GoI situation is in particular a wGoI situation.
Moreover, a direct calculation shows that $der$, $\delta$, $con$, $weak$, as defined in [3] in
terms of the retractions on $\mathcal{C}$, are monoidal pointwise natural transformations
on $\mathcal{G}(\mathcal{C})$. Therefore:

Theorem 5. A wGoI situation gives rise to a GoI category which is a WLC.

The situation is summarized in Figure 1.
[Figure: GoI situation => weak GoI situation -> (GoI construction) -> WLC -> (universal object) -> LCA -> (standardization)]

Fig. 1. (Weak) GoI construction.



4 Wave GoI Algebras and Linear Graph Models

In this section, first we show that some pointed linear graph models can be viewed
as wave GoI algebras arising from the traced category $(\mathit{Rel}_*, \times_*)$, together with
suitable stream functors as monoidal functors. Moreover we show that such wave
models can induce $\lambda$-theories where not all unsolvables of order 0 are equated.



4.1 GoI Algebras on $(\mathit{Rel}_*, \times_*)$

The category $(\mathit{Rel}_*, \times_*)$ is traced with the trace operator $Tr^U_{A,B}(\ )$ defined by:
for $f : A \times U \mathrel{-\!\circ\!\to} B \times U$, $Tr^U_{A,B}(f) = \{(a, b) \mid \exists u.\ (a, u, b, u) \in f\}$.
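The trace operator is easy to state on finite relations. In the following Python sketch a relation $f : A \times U \to B \times U$ is represented, as an assumption of ours, by a set of pairs `((a, u), (b, u2))`; the trace keeps those pairs whose two `U`-components agree.

```python
def trace(f):
    """Tr(f) = {(a, b) | there is u with ((a, u), (b, u)) in f}."""
    return {(a, b) for ((a, u), (b, u2)) in f if u == u2}


f = {
    ((0, "u"), (1, "u")),   # feedback value matches: contributes (0, 1)
    ((0, "u"), (2, "v")),   # feedback values differ: discarded
    ((3, "v"), (4, "v")),   # contributes (3, 4)
}
```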

Both the functor of streams $(\ )_*^{\omega}$ and that of streams with finite codomain
$(\ )_{*f}^{\omega}$ induce on $(\mathit{Rel}_*, \times_*)$ a wGoI situation. We focus on $(\ )_{*f}^{\omega}$.

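Proposition 11. For any GoI reflexive object $U$ in $\mathit{Rel}_*$, $(\mathit{Rel}_*, \times_*, (\ )_{*f}^{\omega}, U)$ is
a wGoI situation.

Proof. The functor $(\ )_{*f}^{\omega}$ is traced strong monoidal with isomorphism $\phi'_I : I \to I_{*f}^{\omega}$,
defined by $\phi'_I(*) = *^{\omega}$, and natural isomorphism $\phi : (\ )_{*f}^{\omega} \times (\ )_{*f}^{\omega} \to (\ \times\ )_{*f}^{\omega}$ with
components $\phi_{A,B} : A_{*f}^{\omega} \times B_{*f}^{\omega} \to (A \times B)_{*f}^{\omega}$ defined by $\phi_{A,B}(a, b) = c$, where $c_i =
(a_i, b_i)$. We only sketch the definitions of the monoidal retractions:

- $e_A : (A_{*f}^{\omega})_{*f}^{\omega} \mathrel{-\!\circ\!\to} A_{*f}^{\omega}$, $e_A = \zeta \circ \lambda_A$, where $\lambda_A : [N \to_{fcod} [N \to_{fcod} A]] \lhd [N \times
N \to_{fcod} A]$ is a component of the retraction natural in $A$ induced by curryfication,
and $\zeta = \lambda f \in [N \times N \to_{fcod} A].\ \lambda n \in N.\ f(\epsilon^{-1}(n))$, where $\epsilon : N \times N \simeq N$ is any bijective
coding of pairs;

- $c_A : A_{*f}^{\omega} \times A_{*f}^{\omega} \to A_{*f}^{\omega}$, $c_A(a, a') = a''$, where, for all $i \geq 0$, $a''_{2i+1} = a_i$ and, for all
$i \geq 1$, $a''_{2i} = a'_i$;

- $d_A : A \to A_{*f}^{\omega}$, $d_A(a) = a^{\omega}$, $d'_A : A_{*f}^{\omega} \mathrel{-\!\circ\!\to} A$, $d'_A(a^{\omega}) = a$;

- $w_A : I \mathrel{-\!\circ\!\to} A_{*f}^{\omega}$, $w_A = \{(*, a^{\omega}) \mid a \in A\}$, $w'_A : A_{*f}^{\omega} \to I$, $w'_A = \{(a^{\omega}, *) \mid a \in A\}$. □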
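The comultiplication above flattens a stream of streams through a bijective coding of pairs. The Python sketch below illustrates this step, modelling streams as functions on the natural numbers. The Cantor pairing function is one possible choice for the bijection (the proof allows any), and all names (`pair`, `unpair`, `flatten`, `dereliction`) are hypothetical.

```python
from math import isqrt


def pair(i, j):
    """Cantor pairing, one possible bijection N x N -> N."""
    return (i + j) * (i + j + 1) // 2 + j


def unpair(n):
    """Inverse of pair."""
    w = (isqrt(8 * n + 1) - 1) // 2
    j = n - w * (w + 1) // 2
    return (w - j, j)


def flatten(f):
    """Turn f : N x N -> A into the stream n |-> f(unpair(n))."""
    return lambda n: f(*unpair(n))


def dereliction(a):
    """d_A(a) is the constant stream on a."""
    return lambda n: a


def grid(i, j):
    """A sample stream of streams with finite codomain."""
    return (i % 2, j % 3)


stream = flatten(grid)
```

Reading position `pair(5, 7)` of the flattened stream returns the entry at row 5, column 7 of the original family, here `(1, 1)`.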

Notice that $d$ in the proof above is natural up-to retraction, but not in the
full sense. This justifies the definition of weak GoI situation.

Remark 3. Notice that both the finite powerset functor $\mathcal{P}_f$ and the finite stream functor
$(\ )_*^{<\omega}$ fail to be strong monoidal, and hence they do not give a GoI situation on $\mathit{Rel}_*$.
Moreover, the category $(\mathit{Rel}, \times)$ together with the (infinite) stream functor fails to give
a wGoI situation, because weakening (as it is defined in the proof of Proposition 11)
does not satisfy the condition of Definition 13 on the empty relation.



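Proposition 12. i) The GoI category $\mathcal{G}(\mathit{Rel}_*, \times_*, (\ )_{*f}^{\omega})$ is a WLC.

ii) Let $U$ be a GoI reflexive object in $\mathit{Rel}_*$, with retractions $\theta_1^* : U \times_* U \lhd U$,
$\theta_2^* : U_{*f}^{\omega} \lhd U$. Then $(\mathcal{P}_*(U \times U), \cdot_{\theta^*}, !_{\theta^*})$ is a LCA, where $\cdot_{\theta^*}$ and $!_{\theta^*}$ are defined
by (using "functional" notation): for all $x, y \in \mathcal{P}_*(U \times U)$

- $x \cdot_{\theta^*} y = \{(a, b) \mid \exists (c, d) \in y.\ (\theta_1^*(a, d), \theta_1^*(b, c)) \in x\}$;

- $!_{\theta^*} x = \{(\theta_2^*(a), \theta_2^*(b)) \mid a, b \in U_{*f}^{\omega} \wedge \forall i.\ (a_i, b_i) \in x\}$.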

A crucial fact for our purposes is that all the GoI algebras of Proposition 12
give rise, for a suitable choice of the coding relations, to finite-codomain stream
LGMs, and hence, by a suitable analogue of Theorem 1, to pointed LGMs.

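Theorem 6. Let $\mathcal{U} = (\mathcal{P}_*(U \times U), \cdot_{\theta^*}, !_{\theta^*})$ be a GoI algebra induced by $(\mathit{Rel}_*, \times_*,
(\ )_{*f}^{\omega})$, with $\theta_1^* : U \times U \lhd U$, $\theta_2^* : U_{*f}^{\omega} \lhd U$. Then $\mathcal{U}$ coincides with the pointed
LGM $\mathcal{U}' = (\mathcal{P}_*(U'), \cdot_{\tau_1^*}, !_{\tau_2^*})$, where $U' = U \times U$, $\tau_1^* : U' \times U' \lhd U'$ and $\tau_2^* :
\mathcal{P}_*^f(U') \lhd U'$ are defined by:

$\tau_1^* = \langle \theta_1^* \circ (\pi_1 \circ \pi_2 \times \pi_2 \circ \pi_1),\ \theta_1^* \circ (\pi_2 \circ \pi_2 \times \pi_1 \circ \pi_1) \rangle$,

$\tau_2^* = (\theta_2^* \times \theta_2^*) \circ \zeta$,

where $\zeta : \mathcal{P}_*^f(U \times U) \mathrel{-\!\circ\!\to} U_{*f}^{\omega} \times U_{*f}^{\omega}$ is the injective relation defined by $(u, (a, b)) \in
\zeta$ iff for all $i$ $(a_i, b_i) \in u$ and for all $(c, d) \in u$ there exists $i$ such that $a_i = c$
and $b_i = d$.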

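Proof. Let $x, y \in \mathcal{P}_*(U \times U)$. By Proposition 12, $x \cdot_{\theta^*} y = \{(a, b) \mid \exists (c, d) \in y.\ (\theta_1^*(a, d),
\theta_1^*(b, c)) \in x\} = \{(a, b) \mid \exists (c, d) \in y.\ \tau_1^*((c, d), (a, b)) \in x\}$, by definition of $\tau_1^*$; i.e. $\cdot_{\theta^*}$ is
the application on the LGM $\mathcal{U}'$. Moreover, by Proposition 12, $!_{\theta^*} x = \{(\theta_2^*(a), \theta_2^*(b)) \mid
a, b \in U_{*f}^{\omega} \wedge \forall i.\ (a_i, b_i) \in x\} = \{(\theta_2^*(a), \theta_2^*(b)) \mid a, b \in U_{*f}^{\omega} \wedge \exists u \subseteq_{fne} x.\ \forall i.\ (a_i, b_i) \in
u \wedge \forall (c, d) \in u\ \exists i.\ (a_i = c \wedge b_i = d)\} = \{(\theta_2^* \times \theta_2^*)(\zeta(u)) \mid u \subseteq_{fne} x\} = \{\tau_2^*(u) \mid u \subseteq_{fne} x\}$,
by definition of $\tau_2^*$; i.e. $!_{\theta^*}$ is the $!$ operator on the pointed LGM $\mathcal{U}'$. □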
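The first half of this proof, the coincidence of the two applications, can be replayed on a finite instance. In the Python sketch below, `theta1` is a hypothetical injective coding of pairs given by tagged tuples, `goi_apply` follows the formula of Proposition 12, and `lgm_apply` is linear graph-model application over $U' = U \times U$ with the coding $\tau_1^*$ of Theorem 6, computed by brute-force search over a finite candidate set. All of these names and representations are illustrative assumptions.

```python
def theta1(a, b):
    """Hypothetical injective coding U x U -> U."""
    return ("pair", a, b)


def is_pair(v):
    return isinstance(v, tuple) and len(v) == 3 and v[0] == "pair"


def goi_apply(x, y):
    """x . y = {(a, b) | there is (c, d) in y with (theta1(a, d), theta1(b, c)) in x}."""
    out = set()
    for (p, q) in x:
        if is_pair(p) and is_pair(q):
            (_, a, d), (_, b, c) = p, q
            if (c, d) in y:
                out.add((a, b))
    return out


def tau1(cd, ab):
    """tau1((c, d), (a, b)) = (theta1(a, d), theta1(b, c)), as in Theorem 6."""
    (c, d), (a, b) = cd, ab
    return (theta1(a, d), theta1(b, c))


def lgm_apply(x, y, universe):
    """x . y = {ab | there is cd in y with tau1(cd, ab) in x}, searching ab in universe."""
    return {ab for ab in universe for cd in y if tau1(cd, ab) in x}


U0 = {0, 1}
universe = {(a, b) for a in U0 for b in U0}
x = {tau1((0, 1), (1, 1)), tau1((1, 0), (0, 0)), (0, 0)}
y = {(0, 1)}
```

Both functions return `{(1, 1)}` on this instance: only the first element of `x` has its premise `(0, 1)` in `y`.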

Notice that by considering the GoI construction over $(\mathit{Rel}_*, \times_*)$, we get a class
of LGMs which is a subclass of the one obtained in Section 2, where $(\mathit{Rel}_*, \times_*)$
itself is viewed as a WLC.

The broad abstract pattern is given in Figure 2.

4.2 Wave GoI $\lambda$-Theories

The class of $\lambda$-theories induced by wave models is quite rich, or at least it goes
beyond theories where all unsolvables of order 0 are equated in the bottom element,
as in models based on games à la [5,22] (see [14]). Namely:

Lemma 1. For any $k > 0$, there exists a GoI algebra on $(\mathit{Rel}_*, \times_*, (\ )_{*f}^{\omega})$ with $k$
self-singletons (up-to-$*$) different from $*$, i.e. elements $a \in U$ such that $\theta_1^*(a, a) =
a$ and $\theta_2^*(\{a, *\}) = a$.
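[Figure, left chain: $(\mathit{Rel}_*, \times_*, (\ )_{*f}^{\omega})$ weak GoI situation -> (GoI construction) -> $\mathcal{G}(\mathit{Rel}_*, \times_*, (\ )_{*f}^{\omega})$ WLC -> (universal object) -> finite-codomain stream LGM -> (is isomorphic to) -> pointed LGM -> (standardization) -> pointed GM]

[Figure, right chain: $(\mathit{Rel}_*, \times_*, (\ )_*^{\omega})$ weak GoI situation -> (GoI construction) -> $\mathcal{G}(\mathit{Rel}_*, \times_*, (\ )_*^{\omega})$ WLC -> (universal object) -> stream LGM -> (is isomorphic to) -> $\omega_1$-pointed LGM -> (standardization) -> $\omega_1$-pointed GM]

Fig. 2. Wave GoI constructions on $\mathit{Rel}_*$.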



One can easily check that any self-singleton (up-to-$*$) $a$ belongs to the
interpretation of a term iff it reduces to a closed $\lambda$-term (see [21]). Therefore:

Theorem 7. There exists a wave model in which
$[\lambda x.\Delta\Delta] \neq [\lambda x.\Delta\Delta x]$.

5 Conclusions and Future Work 

Building on [1,3], we have investigated the connections between graph models
and wave GoI models arising in the basic setting of $(\mathit{Rel}_*, \times_*)$. We have shown
that such wave models are (pointed) graph models, which yield models of the
$\lambda$-calculus capturing a rich class of $\lambda$-theories. The category $(\mathit{Rel}, \times)$ apparently
fails to give a WLC and a wGoI situation, because weakening is not well-behaved
on the empty relation.

However, in [20], in order to capture the case of $(\mathit{Rel}, \times)$, a strict variant
of the GoI situation of [3] has been introduced, giving rise to a strict WLC,
where only a restricted form of weakening holds. LCAs arising from strict WLCs
are themselves strict, i.e. application is strict, and only a restricted form of
K combinator is available. These are models for restricted $\lambda$-calculi, such as
Church's $\lambda I$-calculus and the $\lambda\beta_{KN}$-calculus of [19].

In summing up, we make a disclaimer. The objective of the paper is not that
of characterizing $\lambda$-theories arising from graph models or linear graph models,
or wave-style graph models. This problem is very difficult, but orthogonal to the
one we have addressed. We have shown nonetheless that the class of $\lambda$-theories
induced by GoI wave-style graph models is richer than that of game models. Our
goal in this paper is rather that of showing that the basic intuitions underlying
the applicative machinery of graph models can be subsumed in a more general
setting, by a suitable relaxation of the categorical axiomatization of GoI provided
by Abramsky. Actually, graph models arising from GoI constructions generalize
in various intriguing ways the original graph model construction.

The present paper provides new insights in the theory of graph models, in
that it opens up the possibility of applying Abramsky's paradigm of "splitting
the atom of computation" also to this class of models. In the opposite direction,
the paper illustrates that the expressive power of the GoI axiomatization goes
well beyond game-like models and subsumes also graph-like models.

Finally, here is a list of intriguing open questions. Definitive answers or even 
well-motivated conjectures appear to be rather difficult. 

- Many classes of graph-like models have been considered in the literature, see
[15]. Do they all induce the same $\lambda$-theories as the class of standard GMs?

- Can original LGMs and the generalizations of graph models in [15] be captured
as wave WLCs?

- Are all the theories of GMs induced by wave GoI algebras? Which theories
escape GoI characterizations?

- Are there particle-style $\lambda$-algebras alternative to the ones based on game
categories, which induce a richer class of $\lambda$-theories?

- Finally, an interesting issue to be investigated is that of giving a logical
characterization via intersection types [12] to the graph models arising from wave GoI
constructions. We feel that this will shed more light both on intersection types
and on wave models.
A Categorical Definitions

We collect some categorical definitions. For more details (in particular for the definition
of traced category), we refer to [3,23].

A monoidal functor between monoidal categories $\mathcal{C}$ and $\mathcal{D}$ is a triple $(F, \phi, \phi'_I)$,
where $F : \mathcal{C} \to \mathcal{D}$ is a functor, $\phi$ is a natural transformation with components $\phi_{A,B} :
FA \otimes FB \to F(A \otimes B)$ and $\phi'_I : I \to FI$ is a morphism in $\mathcal{D}$ such that the following
diagrams commute, written here as equations, where $\alpha$, $\lambda$ and $\rho$ denote the associativity
and unit isomorphisms:

$F(\alpha_{A,B,C}) \circ \phi_{A, B \otimes C} \circ (id_{FA} \otimes \phi_{B,C}) = \phi_{A \otimes B, C} \circ (\phi_{A,B} \otimes id_{FC}) \circ \alpha_{FA,FB,FC}$,

$F(\lambda_A) \circ \phi_{I,A} \circ (\phi'_I \otimes id_{FA}) = \lambda_{FA}$,

$F(\rho_A) \circ \phi_{A,I} \circ (id_{FA} \otimes \phi'_I) = \rho_{FA}$.

A monoidal functor is strong when $\phi$ is a natural isomorphism and $\phi'_I$ is an
isomorphism.

A monoidal functor $F : \mathcal{C} \to \mathcal{D}$, with $\mathcal{C}$ and $\mathcal{D}$ symmetric monoidal categories, is
symmetric if the following diagram commutes:

$F(\sigma_{A,B}) \circ \phi_{A,B} = \phi_{B,A} \circ \sigma_{FA,FB}$.

A strong symmetric monoidal functor $F : \mathcal{C} \to \mathcal{D}$ between traced categories is
traced if, for all $f : A \otimes U \to B \otimes U$, $Tr^{FU}_{FA,FB}(\phi^{-1}_{B,U} \circ Ff \circ \phi_{A,U}) = F(Tr^U_{A,B}(f))$.

A monoidal natural transformation $m$ between monoidal functors $(F, \phi, \phi'_I)$ and
$(G, \psi, \psi'_I)$ is a natural transformation $m : F \Rightarrow G$ s.t. the following diagrams commute:

$m_{A \otimes B} \circ \phi_{A,B} = \psi_{A,B} \circ (m_A \otimes m_B)$,

$m_I \circ \phi'_I = \psi'_I$.

A monoidal pointwise natural transformation is a family of maps $m_A : FA \to GA$
s.t. the naturality diagram commutes for morphisms of the form $f : I \to A$, for all
objects $A$.
Abstract. Coercive subtyping is a theory of abbreviation for dependent
type theories. In this paper, we incorporate the idea of coercive
subtyping into the traditional Hindley-Milner type systems in functional 
programming languages. This results in a typing system with coercions, 
an extension of the Hindley-Milner type system. A type inference algo- 
rithm is developed and shown to be sound and complete with respect 
to the typing system. A notion of derivational coherence is developed to 
deal with the problem of ambiguity and the corresponding type inference 
algorithm is shown to be sound and complete. 



1 Introduction 

The Hindley-Milner type system (HM system for short) [8] is the standard core 
of the modern typed functional programming languages. Various extensions to 
the HM system have been proposed in order to enrich a programming language 
with new and more powerful features. These include, for example, Haskell’s class 
mechanism [10], which provides convenient overloading facilities among other 
things. 

Coercive subtyping [14] is a theory of abbreviation developed in the setting 
of dependent type theories, where coercions are regarded as abbreviation mech- 
anisms and directly characterised in the proof system (type theory) extended 
with coercions. It has been implemented in several proof development systems 
[1,19,4] and effectively used in proof development (e.g., [1]). 

In this paper, we incorporate the idea of coercive subtyping into the tradi- 
tional HM type system. There are several motivations in studying the possible 
combination of coercive subtyping and traditional polymorphic typing systems.
First, it leads to a novel approach that increases the power of the HM system 
with new abbreviation mechanisms, which we believe would be useful in various 
programming activities. Secondly, coercive subtyping provides a clean and simple 
theory for abbreviation in dependent type theories. Incorporating its ideas into 
traditional type systems may lead to simple theoretical development and better 
understanding of the more powerful facilities (e.g., overloading) found useful in 
programming. Thirdly, not the least important, studying coercions in polymor- 
phic type systems meets with new challenges, partly because type uniqueness 
simply does not hold in a polymorphic system. 

* This work is partly supported by the UK EPSRC grants GR/M75518, GR/R84092 
and GR/R72259, the EU TYPES grant 21900 on the TYPES project and by an 
EPSRC studentship. 



S. Berardi, M. Coppo, and F. Damiani (Eds.): TYPES 2003, LNCS 3085, pp. 259—275, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004
One of the results of our work is a typing system with coercions, an extension
of the HM type system, together with a sound and complete type inference 
algorithm. Since the HM system is polymorphic, where a term may have more 
than one type, the introduction of coercions has to be very careful; a naive way to 
introduce coercions causes problems. For example, one of the decisions we have 
made is that if a term is already typable in the original HM system, then no 
coercions will be inserted. This also conforms with the intuition and, in practice, 
an implementation of the extended system will not alter the meanings of the 
existing programs. 

We shall also study a notion of derivational coherence that is developed to 
deal with the problem of ambiguity of computational meanings of a term. A 
term may have different completions - there may be different ways to insert 
coercions to make a term typable. The notion of derivational coherence captures 
this and we have developed a sound and complete type inference algorithm for 
derivationally coherent terms. 

We regard the Hindley-Milner system as well-known and refer its introduction 
for example to [21]. In the remainder of this section, we give a summary of 
work on coercive subtyping and other related work. In Section 2, we give a
brief introduction to our approach by considering several simple examples. The 
extended typing system with coercions is presented and explained formally in 
Section 3. In Section 4, the type inference algorithm is presented and proved to 
be sound and complete. Derivational coherence is introduced in Section 5, where 
we also give the corresponding algorithm and discuss the proofs of its soundness 
and completeness. We conclude with some discussions about future work. 

1.1 Coercive Subtyping 

Coercive subtyping is a framework of abbreviation for dependent type theories
[14]. The basic idea is: if there is a coercion $c$ from $A$ to $B$, then an object of
type $A$ may be regarded as an object of type $B$ via $c$ in appropriate contexts.
More precisely, a functional operation $f$ with domain $B$ can be applied to any
object $a$ of $A$, and the application $f\,a$ is definitionally equal to $f(c\,a)$. Intuitively,
we can view $f$ as a context which requires an object of $B$; then the argument
$a$ in the context $f$ stands for its image under the coercion, $c\,a$. Therefore, the term
$f\,a$, originally not well-typed, becomes well-typed and “abbreviates” $f(c\,a)$.

The second author and his colleagues have studied the above simple idea
in the Logical Framework (and type theory), resulting in a very powerful theory
of abbreviation and inheritance, including parameterised coercions and coercions
between parameterised inductive types. In coercive subtyping, the coercion
mechanism is directly characterised in the type theory proof-theoretically. Some
important meta-theoretic aspects of coercive subtyping, such as the results on
conservativity, coherence, and transitivity elimination, have been studied. They
not only justify the adequacy of the theory from a proof-theoretic point of view,
but also provide the basis for implementing coercive subtyping. See [1,4,
13,14,20] for details of some of these developments and applications of coercive
subtyping.
Coercion mechanisms with certain restrictions have been implemented for
dependent type theory in the proof development systems Lego [15] and Coq
[2], by Bailey [1] and Saibi [19], respectively. Callaghan of the Computer Assisted
Reasoning Group at Durham has implemented Plastic [4] , a proof assistant that 
supports the Logical Framework LF and coercive subtyping with a mixture of 
simple coercions, parameterised coercions, coercion rules for parameterised in- 
ductive types, and dependent coercions. 

Remark 1. Incorporating the idea of coercive subtyping into a polymorphic calculus
is not straightforward. Coercive subtyping has been developed in dependent
type theories with inductive data types, which are rather sophisticated systems. 
However, most of them (or at least the standard ones) have the property of type 
uniqueness; that is every well-typed object has a unique type up to computa- 
tional equality. Compared with the polymorphic calculi such as the HM type 
system where an object may have more than one type, one may say that depen- 
dent type theories are ‘simpler’. It is important to bear this in mind when we 
consider combining coercive subtyping with a polymorphic calculus. 



1.2 Modelling Subtyping by Coercions 

Various notions of coercion have been studied in the literature, particularly when 
subtyping systems are considered. In subtyping, we have the subsumption rule, 
which says that if a : A and A < B, then a : B. This can be modelled by means 
of coercions (maps from A to B). In [3], this idea was proposed and used to give 
a coercion-based semantic interpretation of Cardelli and Wegner’s system Fun 
[5]. The idea of coercive subtyping discussed above was influenced by this work. 

The term coercion has also been used to interpret subtyping in simpler settings
as well. For example, Mitchell [17,18] considers a system where conceptually a 
subtype is a subset and thus coercions essentially represent set inclusions. In [6, 
12], the term coercion is used to denote a special restricted form of mapping in 
modelling and explaining subtyping. 

Remark 2. Note that, because of the subsumption rule in subtyping, a term ob- 
tains more types, while in our setting, a term does not get more types. Rather, in 
coercive subtyping or the extended HM-system considered in this paper, where 
there is no subsumption rule, there are more well-typed terms, which are re- 
garded as abbreviations, and typing conflicts are resolved by the insertion of co- 
ercions. Furthermore, this is studied in the typing system at the proof-theoretic 
level. 



2 Some Simple Examples 

We consider the HM type system extended with coercions. Coercions are regarded
as abbreviations; more precisely, if a term is not well-typed in the original
HM type system, and after inserting coercions it becomes well-typed, then
we regard the term to be well-typed and “abbreviate” the completed term with
appropriate coercions inserted.

We shall consider extending the HM type system with two forms of coercions:
argument coercions and function coercions. By argument coercions, we mean
that the argument of a function is coerced according to the typing requirement;
more precisely, the term $f\,a$ abbreviates $f(c\,a)$ if $f : \sigma \to \tau$, $a : \sigma_0$, and there is a
coercion $c$ from $\sigma_0$ to $\sigma$. By function coercions, we mean that a term in a function
position is coerced into an appropriate function accordingly; more precisely, $k\,a$
abbreviates $(c\,k)\,a$ if $k : \sigma$, $a : \sigma_0$, and there is a coercion $c$ from $\sigma$ to a function
type $\sigma_0 \to \tau$.
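The two forms of coercion can be illustrated with a minimal sketch. It is not the typing system of this paper: it assumes monomorphic types only, represents base types as strings and function types as tuples, and simply takes the first matching coercion. The names `arrow`, `complete_app` and the coercion name `c_plusi` are hypothetical.

```python
def arrow(dom, cod):
    """Function type dom -> cod; base types are plain strings."""
    return ("->", dom, cod)


def is_arrow(ty):
    return isinstance(ty, tuple) and ty[0] == "->"


def complete_app(f, f_ty, a, a_ty, coercions):
    """Return (result type, completion) for the application `f a`, or None.

    `coercions` is a list of (name, source type, target type) triples.
    This sketch is monomorphic and returns the first match it finds.
    """
    # Plain application is preferred: no coercion if the term already types.
    if is_arrow(f_ty) and f_ty[1] == a_ty:
        return f_ty[2], f"{f} {a}"
    for c, src, tgt in coercions:
        # Argument coercion: f a abbreviates f (c a).
        if is_arrow(f_ty) and src == a_ty and tgt == f_ty[1]:
            return f_ty[2], f"{f} ({c} {a})"
        # Function coercion: k a abbreviates (c k) a.
        if src == f_ty and is_arrow(tgt) and tgt[1] == a_ty:
            return tgt[2], f"({c} {f}) {a}"
    return None


# Hypothetical coercion declarations used only for illustration.
coercions = [
    ("int2float", "Int", "Float"),
    ("c_plusi", "Plus", arrow("Int", arrow("Int", "Int"))),
]

arg_case = complete_app("plusone", arrow("Float", "Float"), "2", "Int", coercions)
fun_case = complete_app("plus", "Plus", "1", "Int", coercions)
plain_case = complete_app("plusone", arrow("Float", "Float"), "2.0", "Float", coercions)
```

Here `arg_case` inserts the coercion around the argument, `fun_case` inserts it around the term in function position, and `plain_case` is left unchanged because the application is already typable.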

In the following, we give some simple examples to explain the above basic 
idea. The first two examples explain argument coercions, while the last example 
about overloading explains how function coercions work. We assume that the 
types include integers (Int), floating numbers (Float), booleans (Bool), monads
($T\,\sigma$, where $\sigma$ is any type), and a unit type (called Plus).



An Example of Basic Coercions 

The simplest example of coercions, as often used in programming languages, is 
to convert integers to floating point numbers. For example, we can declare

int2float : Int → Float

as a coercion, either in a context or in a program by using the coercion
declaration¹ cdec int2float : Int → Float in ... . Then, assuming 2 : Int and
plusone : Float → Float, the term plusone 2 is typable and abbreviates its
“completion” plusone (int2float 2), where the coercion int2float is inserted.
Note that the completion is typable in the original HM system. More formally,
we say that the term (or program) cdec int2float : Int → Float in plusone 2
has type Float. The function int2float here is represented as a constant in the
typing system. It could be defined externally (e.g., using a system call at runtime).

This coercion is usually handled automatically by programming systems, 
without a formal explanation. We provide a principled explanation of this in a 
setting where we can, for example, formally answer coherence questions. Note 
that we can handle the converse coercion, from floating point numbers to integers 
using e.g. floor, in the same way. 



Using Coercions in Monads 

Monads are a commonly used vehicle in functional programming to deal with 
“imperative” features like state, random numbers, partial functions, error handling
or input/output. Every monad consists of at least a unary type constructor
(called T here), an injection function (called “return” here) and a lifting function. 
We refer the reader for example to [22] for a full introduction. 

¹ We leave out some type variable annotations; see Sections 3.2 and 3.3 for more details.






Coercions can ease the use of monads by allowing the injection of
a value into its “monadified” type (the function return) to be omitted. T in the types for the
examples below can be seen as the error monad. There are two ways to create
values of this monadic type: one is a regular, good value (return : ∀α. α → T α)
and the other is to signal an error or exception (err : ∀α. T α). We can then
define a reciprocal function, from Float to T Float, which captures the division
by zero error:



λx. if (iszero x) err (return (sysdiv 1.0 x)),



where 

if : ∀α. Bool → α → α → α
iszero : Float → Bool
sysdiv : Float → Float → Float

Using the coercion abbreviation mechanism, however, we can leave the return
implicit by declaring it as a coercion:



cdec return : ∀α. α → T α in

λx. if (iszero x) err (sysdiv 1.0 x)



Similar situations occur frequently when a monadic programming style is used, 
making this a fairly useful abbreviation, both for code clarity and brevity. 

Note that, as shown by this example, coercions do not necessarily represent
simple inclusions between types (as considered in the setting of subtyping
[17]). They are arbitrary functional maps which one wishes to omit in favour
of the abbreviated form. In particular, the intuition that coercing one type
into another amounts to a set-theoretic inclusion does not
apply.



Using Coercions for Overloading 

Coercions can be used to represent ad hoc polymorphism, or overloading. For 
example, assume that we have two functions for addition, one for the integers 
and the other for the floating point numbers: 

plusi : Int → Int → Int

plusf : Float → Float → Float

and we wish to use a single notation plus in both cases. This can be done by
means of coercions. What we need to do is to consider a (unit) type Plus which
has an element plus : Plus and then declare the following two (function) coercions:

cdec (λx. plusi) : Plus → (Int → Int → Int)

cdec (λx. plusf) : Plus → (Float → Float → Float)

Then, we can use

plus 1 2 or plus 1.0 2.5

as intended, as these two terms abbreviate plusi 1 2 and plusf 1.0 2.5, respectively.

Note that, in this example, the coercions are defined by λ-terms rather than
just constants. It also shows that a coercion need not coincide with a previously
defined function. The idea of using unit types for overloading was studied by the
second author [14]. See [1] for more applications of this idea.

Remark 3. We considered Plus to be a unit type. In fact, there could be multiple 
elements in Plus (i.e. constants of that type), but they are all treated the same. 



3 Typing System 

3.1 Base Language 

Our starting point in this development is an existing programming language, 
namely a minimal polymorphic programming language with Hindley-Milner type 
system [8] which we call the base language. We assume readers are familiar 
with the basic ideas. We omit additional elements necessary to make this into 
a programming language, namely declaration of new types and recursion. This 
is because we focus on typing, and those features do not affect type checking. 
They can be added. 

The typing judgment in the base language is denoted by 

$\Gamma \vdash_{HM} e : \tau$,

which can be read as “term $e$ has type $\tau$ in context $\Gamma$”. We are extending the
base language with a coercion mechanism, which leads to our system $\vdash$. We shall
explain in Section 3.4 how we can recover the HM system from our rules.



3.2 Syntax and Notations 



Apart from coercion-specific extensions, we use standard notions of terms, types, 
type schemes and contexts [8]. The syntactic symbols to be used are as follows. 



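Type variables: $\alpha, \beta, \gamma, \varepsilon$

Sets of type variables: $\bar\alpha, \bar\beta, \bar\gamma, \bar\varepsilon$

Types: $\sigma, \tau, \rho ::= \alpha \mid \sigma \to \sigma$

Type schemes: $\mu ::= \forall\bar\alpha.\ \sigma$

(Object language) Variables: $x, y, z$

Terms: $e, f, g ::= x \mid e\,e \mid \lambda x.\ e \mid \mathrm{let}\ x = e\ \mathrm{in}\ e \mid \mathrm{cdec}\ c : \forall\bar\alpha.\ \sigma \to \tau\ \mathrm{in}\ e$

Contexts: $\Gamma, \Delta ::= \emptyset \mid \Gamma, x : \mu \mid \Gamma, \mathrm{cdec}\ c : \forall\bar\alpha.\ \sigma \to \tau$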
Notations. The following notations will be used in our description of the system.

- $FV$ stands for the set of (object) variables declared in a context: $FV(\emptyset) = \emptyset$,
$FV(\Gamma, x : \mu) = FV(\Gamma) \cup \{x\}$ and $FV(\Gamma, \mathrm{cdec}\ c : \mu) = FV(\Gamma)$.

- $FTV$ denotes the set of free type variables of a context, type, type scheme
or term. It is defined as:

$$\begin{aligned}
FTV(\emptyset) &= \emptyset\\
FTV(\Gamma, x : \mu) &= FTV(\Gamma) \cup FTV(\mu)\\
FTV(\Gamma, \mathrm{cdec}\ c : \mu) &= FTV(\Gamma) \cup FTV(c) \cup FTV(\mu)\\
FTV(\alpha) &= \{\alpha\}\\
FTV(\sigma \to \tau) &= FTV(\sigma) \cup FTV(\tau)\\
FTV(\forall\bar\alpha.\ \sigma) &= FTV(\sigma) \setminus \bar\alpha\\
FTV(x) &= \emptyset\\
FTV(e\,f) &= FTV(e) \cup FTV(f)\\
FTV(\lambda x.\ e) &= FTV(e)\\
FTV(\mathrm{let}\ x = e\ \mathrm{in}\ f) &= FTV(e) \cup FTV(f)\\
FTV(\mathrm{cdec}\ c : \mu\ \mathrm{in}\ e) &= FTV(\mu) \cup FTV(e) \cup FTV(c)
\end{aligned}$$

- Let $\Gamma$ be a context. The coercion-free part of $\Gamma$ is denoted by $\overline{\Gamma}$ and defined
by $\overline{\emptyset} = \emptyset$, $\overline{\Gamma, x : \mu} = \overline{\Gamma}, x : \mu$ and $\overline{\Gamma'} = \overline{\Gamma}$, where $\Gamma'$ is $\Gamma, \mathrm{cdec}\ c : \forall\bar\alpha.\ \sigma \to \tau$.
Furthermore, we write $\Gamma(x) = \mu$ if $x : \mu$ is an entry of $\Gamma$.

- $\forall\emptyset.\ \sigma$ is a special case of $\forall\bar\alpha.\ \sigma$, denoting a type scheme with no bound variables.
We may omit $\forall\emptyset$ when the context makes it clear that we mean a type
scheme instead of a type.

- $\sigma \prec_{\bar\alpha} \mu$ means that $\sigma$ is a generic instance of $\mu$ where all (free) type variables
of $\sigma$ are in $\bar\alpha$.
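The definitions of $FV$, $FTV$ and the coercion-free part translate directly into recursive functions. The following is a minimal sketch under an assumed encoding that is not part of the paper: types are `("var", a)` or `("fun", s, t)`, a type scheme is a pair of bound variables and a type, and context entries and terms are tagged tuples.

```python
def ftv_type(t):
    """FTV of a type: ("var", a) or ("fun", dom, cod)."""
    if t[0] == "var":
        return {t[1]}
    return ftv_type(t[1]) | ftv_type(t[2])


def ftv_scheme(mu):
    """FTV of a scheme (bound_vars, type): free variables minus the bound ones."""
    bound, body = mu
    return ftv_type(body) - set(bound)


def ftv_term(e):
    """FTV of a term; only cdec annotations contribute type variables."""
    tag = e[0]
    if tag == "x":                      # ("x", name)
        return set()
    if tag == "app":                    # ("app", e, f)
        return ftv_term(e[1]) | ftv_term(e[2])
    if tag == "lam":                    # ("lam", x, body)
        return ftv_term(e[2])
    if tag == "let":                    # ("let", x, e, f)
        return ftv_term(e[2]) | ftv_term(e[3])
    if tag == "cdec":                   # ("cdec", c, mu, body)
        return ftv_scheme(e[2]) | ftv_term(e[3]) | ftv_term(e[1])
    raise ValueError(f"unknown term tag: {tag}")


def fv_ctx(ctx):
    """FV: object variables declared in a context; cdec entries declare none."""
    return {entry[1] for entry in ctx if entry[0] == "var"}


def ftv_ctx(ctx):
    """FTV of a context, entry by entry."""
    out = set()
    for entry in ctx:
        if entry[0] == "var":           # ("var", x, mu)
            out |= ftv_scheme(entry[2])
        else:                           # ("cdec", c, mu)
            out |= ftv_term(entry[1]) | ftv_scheme(entry[2])
    return out


def coercion_free(ctx):
    """Coercion-free part of a context: drop all cdec entries."""
    return [entry for entry in ctx if entry[0] == "var"]


# Example context: x : a, id : forall b. b -> b, cdec c : forall g. g -> d
ctx = [
    ("var", "x", ((), ("var", "a"))),
    ("var", "id", (("b",), ("fun", ("var", "b"), ("var", "b")))),
    ("cdec", ("x", "c"), (("g",), ("fun", ("var", "g"), ("var", "d")))),
]
```

For this example context, the declared variables are `x` and `id`, and the free type variables are `a` and `d`, since `b` and `g` are bound in their schemes.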

3.3 Judgment Forms and Rules 

The rules in fig. 1 define our typing system. The forms of judgments are: 

- $\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$. This should be read as “term $e$ has type $\tau$ and completion
$e'$ in context $\Gamma$ with free type variables $\bar\alpha$”. We extend the usual typing
judgment for ML-like languages, $\Gamma \vdash e : \tau$, by allowing coercion declarations
in the context, adding the completion $e'$ and an explicit annotation for the
free type variables which may occur in $\Gamma$, $\tau$ and $e$.

- $\Gamma\ \bar\alpha\text{-valid}$. This captures the notion that “context $\Gamma$ is valid with free type
variables in $\bar\alpha$”. Note that this judgment is useful as we
consider coercions in contexts subject to certain restrictions.

- $\Gamma \vdash^{\bar\alpha} \sigma \to_c \tau$. This third form of judgment expresses that “coercion $c$ from
$\sigma$ to $\tau$ can be derived from context $\Gamma$”.

We also use the notation $\Gamma \nvdash_{HM} e : ?$ to express the side condition that $e$ is not
typable in the HM system.

Product Types. We can extend the language without affecting the basic results
and mechanisms presented. For example, some of the examples below require
the use of pairs. We can add them to our language in
the standard way, using rules like the following:

(PairIn)
$$\frac{\Gamma \vdash^{\bar\alpha} e_1 : \tau_1 \Rightarrow e_1' \qquad \Gamma \vdash^{\bar\alpha} e_2 : \tau_2 \Rightarrow e_2'}{\Gamma \vdash^{\bar\alpha} (e_1, e_2) : \tau_1 \times \tau_2 \Rightarrow (e_1', e_2')}$$
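(CId)
$$\frac{}{\emptyset\ \ \bar\alpha\text{-valid}}$$

(CVar)
$$\frac{\Gamma\ \ \bar\alpha\text{-valid}}{\Gamma, x : \mu\ \ \bar\alpha\text{-valid}} \qquad x \notin FV(\Gamma),\ FTV(\mu) \subseteq \bar\alpha$$

(CCoer)
$$\frac{\Gamma\ \ \bar\alpha\text{-valid} \qquad \Gamma \vdash^{\bar\alpha \cup \bar\beta} c_0 : \sigma \to \tau \Rightarrow c}{\Gamma, \mathrm{cdec}\ c : \forall\bar\beta.\ \sigma \to \tau\ \ \bar\alpha\text{-valid}} \qquad \bar\alpha \cap \bar\beta = \emptyset$$

(Id)
$$\frac{\Gamma\ \ \bar\alpha\text{-valid}}{\Gamma \vdash^{\bar\alpha} x : \tau \Rightarrow x} \qquad \tau \prec_{\bar\alpha} \mu,\ \Gamma(x) = \mu$$

(Abs)
$$\frac{\Gamma, x : \forall\emptyset.\ \sigma \vdash^{\bar\alpha} e : \tau \Rightarrow e'}{\Gamma \vdash^{\bar\alpha} \lambda x.\ e : \sigma \to \tau \Rightarrow \lambda x.\ e'}$$

(App)
$$\frac{\Gamma \vdash^{\bar\alpha} e_1 : \sigma \to \tau \Rightarrow e_1' \qquad \Gamma \vdash^{\bar\alpha} e_2 : \sigma \Rightarrow e_2'}{\Gamma \vdash^{\bar\alpha} e_1 e_2 : \tau \Rightarrow e_1' e_2'}$$

(App$_{ac}$)
$$\frac{\Gamma \vdash^{\bar\alpha} e_1 : \sigma \to \tau \Rightarrow e_1' \qquad \Gamma \vdash^{\bar\alpha} e_2 : \sigma_0 \Rightarrow e_2' \qquad \Gamma \vdash^{\bar\alpha} \sigma_0 \to_c \sigma}{\Gamma \vdash^{\bar\alpha} e_1 e_2 : \tau \Rightarrow e_1' (c\, e_2')} \qquad \Gamma \nvdash_{HM} e_1' e_2' : ?$$

(App$_{fc}$)
$$\frac{\Gamma \vdash^{\bar\alpha} e_1 : \sigma_0 \Rightarrow e_1' \qquad \Gamma \vdash^{\bar\alpha} e_2 : \sigma \Rightarrow e_2' \qquad \Gamma \vdash^{\bar\alpha} \sigma_0 \to_c (\sigma \to \tau)}{\Gamma \vdash^{\bar\alpha} e_1 e_2 : \tau \Rightarrow (c\, e_1')\, e_2'} \qquad \Gamma \nvdash_{HM} e_1' e_2' : ?$$

(Let)
$$\frac{\Gamma \vdash^{\bar\alpha \cup \bar\beta} e_1 : \sigma \Rightarrow e_1' \qquad \Gamma, x : \forall\bar\beta.\ \sigma \vdash^{\bar\alpha} e_2 : \tau \Rightarrow e_2'}{\Gamma \vdash^{\bar\alpha} \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 : \tau \Rightarrow \mathrm{let}\ x = e_1'\ \mathrm{in}\ e_2'} \qquad \bar\alpha \cap \bar\beta = \emptyset$$

(Decl)
$$\frac{\Gamma \vdash^{\bar\alpha \cup \bar\beta} c : \sigma \to \tau \Rightarrow c' \qquad \Gamma, \mathrm{cdec}\ c' : \forall\bar\beta.\ \sigma \to \tau \vdash^{\bar\alpha} e : \rho \Rightarrow e'}{\Gamma \vdash^{\bar\alpha} \mathrm{cdec}\ c : \forall\bar\beta.\ \sigma \to \tau\ \mathrm{in}\ e : \rho \Rightarrow e'} \qquad \bar\alpha \cap \bar\beta = \emptyset$$

(Lup)
$$\frac{}{\Gamma, \mathrm{cdec}\ c : \forall\bar\beta.\ \sigma_0 \to \tau_0, \Gamma' \vdash^{\bar\alpha} \sigma \to_c \tau} \qquad \sigma \to \tau \prec_{\bar\alpha} \forall\bar\beta.\ \sigma_0 \to \tau_0$$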

Fig. 1. Typing Rules 



The results and the type checking algorithm can be extended in straightforward
ways.



3.4 Explanations 

We give some informal explanations and prove some basic properties of the 
system presented above. 



Completion and Relation to HM. The above system is an extension of
the system $\vdash_{HM}$ in the sense that, if we remove the rules App$_{ac}$, App$_{fc}$, Decl,
Lup and CCoer and the notation of completion, the resulting system is equivalent
to Hindley-Milner typing. We say that a program $e$ is well-typed if
$\emptyset \vdash^{\bar\alpha} e : \tau \Rightarrow e'$ for some type $\tau$, completion $e'$, and set of type variables $\bar\alpha$.

An addition to the language is completion. Informally, we insert all the
needed coercion functions in a term $e$ to form its completion $e'$, such that the
completed term is typable in the system without the coercion rules App$_{ac}$, App$_{fc}$
and CCoer, i.e. in the base language $\vdash_{HM}$. This is formally captured by Lemma 1,
which establishes the relationship between our typing judgment $\vdash$ and that of
the base language $\vdash_{HM}$. It makes precise why we call $e'$ a “completion”:
the completion is an expansion of the term $e$ in question, and this completion
type checks in the base language.

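Definition 1 (Term Expansion). The notion that a term $e_2$ expands a term
$e_1$, in symbols $e_1 \leq e_2$, is inductively defined as follows.

$$\begin{aligned}
&x \leq x\\
&\lambda x.\ e_1 \leq \lambda x.\ e_2 &&\text{if } e_1 \leq e_2\\
&\mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 \leq \mathrm{let}\ x = e_3\ \mathrm{in}\ e_4 &&\text{if } e_1 \leq e_3 \text{ and } e_2 \leq e_4\\
&e_1 e_2 \leq e_3 e_4 &&\text{if } e_1 \leq e_3 \text{ and } e_2 \leq e_4\\
&e_2 e_3 \leq (e_1 e_2)\, e_3\\
&e_1 e_3 \leq e_1 (e_2 e_3)\\
&\mathrm{cdec}\ c : \forall\bar\alpha.\ \sigma \to \tau\ \mathrm{in}\ e \leq e\\
&e_1 \leq e_3 &&\text{if } e_1 \leq e_2 \text{ and } e_2 \leq e_3
\end{aligned}$$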

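Lemma 1 (Completion). If $\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$, then $\Gamma \vdash_{HM} e' : \tau$ and $e \leq e'$.

Proof Sketch. We prove the following two statements by simultaneous induction
on the derivations of $\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$ and $\Gamma\ \bar\alpha\text{-valid}$.

- If $\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$, then $\Gamma \vdash_{HM} e' : \tau$ and $e \leq e'$.

- If $\Gamma\ \bar\alpha\text{-valid}$ and $\Gamma \vdash^{\bar\alpha} \sigma \to_c \tau$, then $\Gamma \vdash_{HM} c : \sigma \to \tau$.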

Free Type Variables $\bar\alpha$. The handling of type variables needs some explanation.
The standard notation of typing judgment assumes that the free type
variables in $\Gamma$ can be chosen arbitrarily. On the other hand, we require that
all type variables must either be bound or chosen from the $\bar\alpha$ denoted in the
judgment. Formally, the role of the free type variable annotations is captured by the
following lemma, which has three parts, one for each of the judgment forms.

Lemma 2 (Free type variables).

1. If $\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$, then $FTV(\Gamma) \subseteq \bar\alpha$, $FTV(e) \subseteq \bar\alpha$ and $FTV(\tau) \subseteq \bar\alpha$.

2. If $\Gamma\ \bar\alpha\text{-valid}$, then $FTV(\Gamma) \subseteq \bar\alpha$.

3. If $\Gamma \vdash^{\bar\alpha} \sigma \to_c \tau$, then $FTV(\Gamma) \subseteq \bar\alpha$, $FTV(\sigma \to \tau) \subseteq \bar\alpha$ and $FTV(c) \subseteq \bar\alpha$.

By explicitly denoting all possible free type variables, we no longer require
the notion of “generalisation” in the formulation of the Let rule, which, in our
opinion, clarifies its intention.







Remark 4. Another way of looking at this is that there are no free type variables,
but all type variables are bound: some explicitly in type schemes, while all others
are bound by the global quantification $\forall\bar\alpha$. To our knowledge, this is the first
time this reformulation of the Let rule has been published. It is due to McKinna [16].

In the rule Abs, we add $x$ to the context, quantifying over no variables. This
means that all type variables in $\bar\alpha$ are non-generic and cannot be instantiated in
the derivation of $e : \tau$. This is in contrast to the Let rule, which allows generic
type variables.



Global and local coercions. Besides assignments of types (more precisely type
schemes) to variables, our contexts also contain declarations of (global) coercions,
of the form $\mathrm{cdec}\ c : \forall\bar\alpha.\ \sigma \to \tau$. The form of coercions is unlimited and can
be any expression in the base language, like a constant function between base
types or a function between arbitrary types computing the result in a complex
way. The coercions declared in a context are well-typed and can be looked up
by means of the rule Lup. We have

Lemma 3. If $\Gamma \vdash^{\bar\alpha} \sigma \to_c \tau$, then $\Gamma \vdash^{\bar\alpha} c : \sigma \to \tau \Rightarrow c$.

In fact, we know that any declared coercion is well-typed in the HM system (cf.
Lemma 1).

Besides global coercions, we also allow local coercion declarations in pro- 
grams, similar to the way let works. 

Example 1 (Localised Coercions). This example shows the scope of a coercion declaration.
In Δ = plusone : Int → Int, 1 : Int, 1.0 : Float, plus : Float → Float → Float,
the following program is well-typed.

plus (plusone 1)

(cdec floor : ∀∅. Float → Int in plusone 1.0)

However, since the coercion is not available when plusone is first used, the
following is not typable in Δ:

plus (plusone 1.0)

(cdec floor : ∀∅. Float → Int in plusone 1.0)

Rules for argument and function coercions. Let us have a closer look
at the special rules App$_{ac}$ and App$_{fc}$ for argument and function coercions, in
particular at their side condition. By $\Gamma \nvdash_{HM} e_1' e_2' : ?$ we mean that $e_1' e_2'$ is not
typable in the base language, i.e. there is no type $\tau$ such that $\Gamma \vdash_{HM} e_1' e_2' : \tau$. We
illustrate the necessity of this side condition with an example which shows that
otherwise ambiguity arises, which would lead to a non-unique meaning of certain
terms.
Example 2. We assume that $A$ and $B$ are any base types inhabited by the constants
$a : A$ and $b_1, b_2 : B$, and we have product types. Using the abbreviations

$$\begin{aligned}
\Gamma &= \mathrm{cdec}\ \lambda(x, y).\ (b_1, b_2) : \forall\emptyset.\ A \times A \to B \times B\\
f &= \lambda(x, y).\ x\\
g &= \lambda(x, y).\ (y, x)
\end{aligned}$$

we can obviously derive

$$\begin{aligned}
&\Gamma \vdash^{\bar\alpha} f : B \times B \to B \Rightarrow f\\
&\Gamma \vdash^{\bar\alpha} g : A \times A \to A \times A \Rightarrow g\\
&\Gamma \vdash^{\bar\alpha} g : B \times B \to B \times B \Rightarrow g
\end{aligned}$$

Thus using App$_{ac}$ without the side condition $\Gamma \nvdash_{HM} e_1' e_2' : ?$ we could derive
the following, where $c = \lambda(x, y).\ (b_1, b_2)$:

$$\begin{aligned}
&\Gamma \vdash^{\bar\alpha} f(g(a, a)) : B \Rightarrow f(c(g(a, a)))\\
&\Gamma \vdash^{\bar\alpha} f(g(a, a)) : B \Rightarrow f(g(c(a, a)))
\end{aligned}$$

However, $f(c(g(a, a)))$ computes to $b_1$ while $f(g(c(a, a)))$ computes to $b_2$. This is a very
bad situation, since it means that evaluation can no longer be uniquely defined,
and thus the term $f(g(a, a))$ no longer has a definite, unique meaning.
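The two completions of Example 2 can be evaluated directly. In the following sketch the constants $a$, $b_1$, $b_2$ are modelled as strings and pairs as Python tuples; this encoding is an assumption made only for illustration.

```python
# Constants of the base types A and B, modelled as strings.
a, b1, b2 = "a", "b1", "b2"


def c(pair):
    """The coercion  λ(x, y). (b1, b2)  from A × A to B × B."""
    return (b1, b2)


def f(pair):
    """f = λ(x, y). x"""
    return pair[0]


def g(pair):
    """g = λ(x, y). (y, x)"""
    return (pair[1], pair[0])


# Completion 1: the coercion is applied to the result of g.
first = f(c(g((a, a))))

# Completion 2: the coercion is applied to the argument of g.
second = f(g(c((a, a))))
```

The first completion evaluates to `b1` and the second to `b2`, because in the second case `g` swaps the components of the coerced pair before `f` projects.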

The side condition prevents this particular ambiguity by forbidding the use
of App$_{ac}$ and App$_{fc}$ when App can be used. In other words, it gives preference to
derivations which do not involve coercions, and a coercion may only be applied
if it is needed because typing would otherwise fail. The side condition is decidable, for
example by the traditional algorithm W. This side condition does not prevent all
forms of ambiguity, however. Section 5 discusses how to deal with them.

The example shows an essential difference to coercive subtyping in Type 
Theory with its unique and explicit typing, where the type of g would fully 
determine the type of the coercion function to apply and whether a coercion is 
needed at all. 

The side conditions on rules App$_{ac}$ and App$_{fc}$ have another effect too. In coercive
subtyping for Type Theory, the question arises whether identity coercions
(i.e. the identity function declared as coercion) are allowed. We do not forbid 
them, but these side conditions ensure that they will never be used, since an 
application with an identity coercion can always be typed without it. 



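Let Expression. One noticeable feature of our typing rules is that there are no
coercion-specific rules involving let. Corresponding to the rules for application,
one might expect to find something like:

(Let$_c$)
$$\frac{\Gamma \vdash^{\bar\alpha \cup \bar\beta} e_1 : \sigma_0 \Rightarrow e_1' \qquad \Gamma, x : \forall\bar\beta.\ \sigma \vdash^{\bar\alpha} e_2 : \tau \Rightarrow e_2' \qquad \Gamma \vdash^{\bar\alpha \cup \bar\beta} \sigma_0 \to_c \sigma}{\Gamma \vdash^{\bar\alpha} \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 : \tau \Rightarrow \mathrm{let}\ x = c\, e_1'\ \mathrm{in}\ e_2'} \qquad \bar\alpha \cap \bar\beta = \emptyset$$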
With this rule, basic soundness conditions still hold, like Lemma 1 saying that
the completion is well-typed in the base language. Thus it is not obviously wrong
to add this rule. Simple examples show that Let$_c$ is not admissible. Consider
(assuming $A$, $B$, $C$ and $D$ are any types) $\Gamma = x : A,\ c : A \to B,\ \mathrm{cdec}\ c : A \to B$.
With the new rule we can then derive let y = x in y : B; without it we cannot.

Another example shows the complication of the rule Let$_c$. With
$\Gamma = a : A,\ b : B,\ c_1 : A \to (C \to D),\ c_2 : B \to C,\ \mathrm{cdec}\ c_1 : A \to (C \to D),\ \mathrm{cdec}\ c_2 : B \to C$
and assuming the Let$_c$ rule is present, we are able to derive
$\Gamma \vdash^{\emptyset} \mathrm{let}\ x = a\ \mathrm{in}\ x\, b : D \Rightarrow \mathrm{let}\ x = c_1 a\ \mathrm{in}\ x\, (c_2 b)$. Essentially, this amounts to a
simultaneous use of function and argument coercions, which is not admissible
in our rules.

These examples illustrate that Let$_c$ would allow a more liberal use of coercions.
Our intention, however, is to restrict the situations in which they can occur
to allow a formulation of derivational coherence (see Section 5). A consequence
of a rule like Let$_c$ is that a type checking algorithm (Section 4) would need to
search for $\sigma$, which is not present in the conclusion of the rule; this may cause
difficulties.

4 Type Checking Algorithm 

The previous section describes our type system, which adds coercions to Hindley-Milner
type systems. The rules in fig. 1 describe well-typing, but they do not
provide a decision procedure to verify well-typedness. This is mainly due to the
application rules (App, App$_{ac}$ and App$_{fc}$), in which the argument type $\sigma$ cannot
be inferred from the typing judgment whose validity is to be verified, and thus
there are infinitely many derivation trees to check.

This section provides a different set of rules to resolve this problem (fig. 2). 

4.1 Algorithm 

In the tradition of algorithm W [8], the rules in fig. 2 describe typing for most
general types. These rules can be read as an algorithm, which we call “algorithm
$W_c$”, to give non-deterministic answers to the question: “Given $\Gamma$ and $e$,
what are the type and completion of $e$?”. The inputs are the context $\Gamma$ and the term $e$,
and the outputs are a substitution $S$, a type $\tau$ and a completion $e'$. It is non-deterministic
because of the rules LCdec$_1^W$ and LCdec$_2^W$, where multiple coercions $c$ can be
found for a given pair of types $\sigma$ and $\tau$. In Section 5 we will provide a deterministic
algorithm together with a characterisation of its modified behaviour.

$W_c$ is presented with judgments of the following forms:

- $\Gamma \vdash_W e \leadsto (\tau, S, e')$, which can be read as “in context $\Gamma$, term $e$ type checks
with type $\tau$, substitution $S$ and completion $e'$”.

- $\Gamma\ \text{valid}$ expresses that “$\Gamma$ is valid”; in particular, it means that the
coercions declared in it are well-typed.

- $\Gamma \vdash_L \sigma \to \tau \leadsto c$ stands for “in context $\Gamma$, the lookup for a coercion from
type $\sigma$ to type $\tau$ yields the coercion term $c$”.




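(CId$^W$)
$$\frac{}{\emptyset\ \text{valid}}$$

(CVar$^W$)
$$\frac{\Gamma\ \text{valid}}{\Gamma, x : \mu\ \text{valid}} \qquad x \notin FV(\Gamma)$$

(CCoer$^W$)
$$\frac{\Gamma\ \text{valid} \qquad \Gamma \vdash_W c \leadsto (\sigma \to \tau, S, c')}{\Gamma, \mathrm{cdec}\ c : \forall\bar\beta.\ \sigma \to \tau\ \text{valid}}$$

(Id$^W$)
$$\frac{\Gamma, x : \forall\alpha_1, \ldots, \alpha_n.\ \tau, \Gamma'\ \text{valid}}{\Gamma, x : \forall\alpha_1, \ldots, \alpha_n.\ \tau, \Gamma' \vdash_W x \leadsto ([\beta_i/\alpha_i]\tau, \emptyset, x)} \qquad \beta_i\ \text{new}$$

(Abs$^W$)
$$\frac{\Gamma, x : \forall\emptyset.\ \alpha \vdash_W e \leadsto (\tau, S \circ \{\alpha \mapsto \sigma\}, e')}{\Gamma \vdash_W \lambda x.\ e \leadsto (\sigma \to \tau, S, \lambda x.\ e')} \qquad \alpha\ \text{new}$$

(App$^W$)
$$\frac{\Gamma \vdash_W e_1 \leadsto (\tau_1, S_1, e_1') \qquad S_1\Gamma \vdash_W e_2 \leadsto (\tau_2, S_2, e_2') \qquad \mathit{unify}_c(\Gamma, S_2\tau_1, \tau_2, e_1', e_2') \leadsto (T, e_3')}{\Gamma \vdash_W e_1 e_2 \leadsto (T S_2 \tau_1, T \circ S_2 \circ S_1, e_3')}$$

(Let$^W$)
$$\frac{\Gamma \vdash_W e_1 \leadsto (\tau_1, S_1, e_1') \qquad S_1\Gamma, x : \mathit{Gen}(\tau_1, S_1\Gamma) \vdash_W e_2 \leadsto (\tau_2, S_2, e_2')}{\Gamma \vdash_W \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 \leadsto (\tau_2, S_2 \circ S_1, \mathrm{let}\ x = e_1'\ \mathrm{in}\ e_2')}$$

(Decl$^W$)
$$\frac{\Gamma \vdash_W c \leadsto (\rho_0, \emptyset, c') \qquad \Gamma, \mathrm{cdec}\ c' : \forall\bar\beta.\ \sigma \to \tau \vdash_W e \leadsto (\rho, S, e')}{\Gamma \vdash_W \mathrm{cdec}\ c : \forall\bar\beta.\ \sigma \to \tau\ \mathrm{in}\ e \leadsto (\rho, S, e')}$$

(Unc$^W$)
$$\frac{\mathit{unify}(\beta, \tau) = T}{\mathit{unify}_c(\Gamma, \beta \to \sigma, \tau, e_1, e_2) \leadsto (T, e_1 e_2)}$$

(Unc$_{ac}^W$)
$$\frac{\mathit{unify}(\beta \to \tau, \sigma_0 \to \sigma) = T \qquad \Gamma \vdash_L \sigma_0 \to \sigma \leadsto c \qquad \mathit{unify}(\beta, \tau)\ \text{fails}}{\mathit{unify}_c(\Gamma, \beta \to \sigma, \tau, e_1, e_2) \leadsto (T, e_1 (c\, e_2))}$$

(Unc$_{fc}^W$)
$$\frac{\mathit{unify}(\beta \to (\tau \to \pi), \sigma_0 \to \sigma) = T \qquad \Gamma \vdash_L \sigma_0 \to \sigma \leadsto c \qquad \mathit{unify}(\beta, \tau)\ \text{fails}}{\mathit{unify}_c(\Gamma, \beta, \tau, e_1, e_2) \leadsto (T, (c\, e_1)\, e_2)}$$

(LVar$^W$)
$$\frac{\Gamma \vdash_L \sigma \to \tau \leadsto c}{\Gamma, x : \mu \vdash_L \sigma \to \tau \leadsto c}$$

(LCdec$_1^W$)
$$\frac{\Gamma \vdash_L \sigma \to \tau \leadsto c}{\Gamma, \mathrm{cdec}\ c_0 : \forall\bar\beta.\ \sigma_0 \to \tau_0 \vdash_L \sigma \to \tau \leadsto c}$$

(LCdec$_2^W$)
$$\frac{}{\Gamma, \mathrm{cdec}\ c : \forall\bar\beta.\ \sigma \to \tau \vdash_L \sigma \to \tau \leadsto c}$$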
We use the standard notion of first-order unification, $\mathit{unify}$. It is easy to see
that the traditional algorithm W can be recovered from the rules in fig. 2 by
removing the rules CCoer$^W$, Decl$^W$, Unc$_{ac}^W$ and Unc$_{fc}^W$. (In that case $\vdash_L$ will not be
used either.) Using this observation and the soundness and completeness of W, we
can see that the condition on $c$ in rule Decl$^W$ is actually the same as in Decl in
fig. 1.

Note that the side condition of rules App$_{ac}$ and App$_{fc}$ in fig. 1 refers to the
separate system of HM typing, while the implementation uses a simple unification
test in Unc$^W$ and does not need to refer to a separate type checking
algorithm.



4.2 Soundness and Completeness 

Algorithm Wc (fig. 2) is a sound and complete implementation of the typing 
rules (fig. 1), in the following sense. 

Soundness expresses that the computed result type and completion can be 
derived using the typing rules. 

Theorem 1 (Soundness). Assume that we can derive $\Gamma \vdash_W e \leadsto (\tau, S, e')$.
Let $\bar\alpha = FTV(S\Gamma, \tau, e)$. Then $S\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$.

Proof Sketch. We can prove this by strengthening it with the additional condition:
if $\Gamma\ \text{valid}$ and $\bar\alpha = FTV(\Gamma)$, then $\Gamma\ \bar\alpha\text{-valid}$. We then do simultaneous
induction on the derivations of $\Gamma \vdash_W e \leadsto (\tau, S, e')$ and $\Gamma\ \text{valid}$, using much
of the structure and lemmas from [7]. Use of Unc$^W$ in App$^W$ by the algorithm
corresponds to rule App, whereas Unc$_{ac}^W$ and Unc$_{fc}^W$ correspond to App$_{ac}$ and
App$_{fc}$, respectively.

Completeness means that for any given completion, every derivable type for 
a term is an instance of the type computed by the algorithm for the result with 
this completion. 

Theorem 2 (Completeness). If $S\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$, then there is exactly
one type $\sigma$ and one substitution $T$ such that $\Gamma \vdash_W e \leadsto (\sigma, T, e')$, and there is a
substitution $U$ with $\tau = U\sigma$ and $S\Gamma = UT\Gamma$.

Proof Sketch. The proof uses induction on the derivation of $S\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$.
Thus when looking for the right derivations for $W_c(\Gamma; e)$ to prove the theorem,
we already know the completion in the result. This completion resolves possible
ambiguities in the choice of rules Unc$_{ac}^W$ or Unc$_{fc}^W$.

5 Resolving Ambiguities 

The rules in fig. 2 allow certain ambiguities, which can occur if there is more than
one matching coercion during coercion search in the Unc$^W$ rules. Assume, for example,
that $A$ and $B$ are base types and $\Gamma$ is $f : \alpha \times \alpha \to \alpha,\ a : A,\ \mathrm{cdec}\ c_1 : \forall\emptyset.\ A \to A \times A,\ \mathrm{cdec}\ c_2 : \forall\emptyset.\ A \to B \times B$.
Then we have both $\Gamma \vdash_W f\,a \leadsto (A, \emptyset, f(c_1 a))$
and $\Gamma \vdash_W f\,a \leadsto (B, \emptyset, f(c_2 a))$.

Such a situation is not desirable, since it means that the evaluation behaviour 
is not uniquely defined. This is the coherence problem which needs to be ad- 
dressed for any system of (coercive) subtyping. 

We can solve this problem by replacing $\mathit{unify}_c$ in App$^W$ by $\mathit{unify}_c^1$, which
succeeds if and only if $\mathit{unify}_c$ returns a unique result:

Definition 2 ($\mathit{unify}_c^1$). $\mathit{unify}_c^1(\Gamma, \beta, \tau, e) \leadsto (T, f)$ if $\mathit{unify}_c(\Gamma, \beta, \tau, e) \leadsto (T, f)$
and for all $U$, $g$ such that $\mathit{unify}_c(\Gamma, \beta, \tau, e) \leadsto (U, g)$, $U = T$ and $f = g$.

$\mathit{unify}_c^1$ is effectively decidable since $\mathit{unify}_c$ is decidable and can only return a
finite number of results.

We call $W_c^1$ the algorithm obtained from $W_c$ where the App$^W$
case uses $\mathit{unify}_c^1$ instead of $\mathit{unify}_c$, and write $\vdash_W^1$ for the corresponding judgment.
Algorithm $W_c^1$ can return at most one result, and is therefore a deterministic
algorithm, in contrast to the non-deterministic $W_c$.
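The uniqueness check can be sketched in a few lines. The code below is a simplification, not the rules of fig. 2: it assumes that declared coercions are monomorphic, uses a textbook first-order unification, and reserves the name `_res` for a fresh result type variable. All function names are hypothetical.

```python
def var(name):
    return ("var", name)


def con(name, *args):
    return ("con", name, tuple(args))


def arrow(dom, cod):
    return con("->", dom, cod)


def prod(left, right):
    return con("*", left, right)


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while t[0] == "var" and t[1] in s:
        t = s[t[1]]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if t[0] == "var":
        return t[1] == v
    return any(occurs(v, arg, s) for arg in t[2])


def unify(t1, t2, s):
    """First-order unification; returns an extended substitution or None."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if t1[0] == "var":
        return None if occurs(t1[1], t2, s) else {**s, t1[1]: t2}
    if t2[0] == "var":
        return unify(t2, t1, s)
    if t1[1] != t2[1] or len(t1[2]) != len(t2[2]):
        return None
    for x, y in zip(t1[2], t2[2]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def resolve(t, s):
    """Apply substitution s to type t."""
    t = walk(t, s)
    if t[0] == "var":
        return t
    return ("con", t[1], tuple(resolve(arg, s) for arg in t[2]))


def unify_c(coercions, fun_ty, arg_ty, e1, e2):
    """All (result type, completion) pairs for the application e1 e2."""
    res = var("_res")  # assumed not to occur in the input types
    s = unify(fun_ty, arrow(arg_ty, res), {})
    if s is not None:  # plain application: no coercion is considered
        return [(resolve(res, s), f"{e1} {e2}")]
    results = []
    for name, src, tgt in coercions:
        # Argument coercion: e1 (c e2).
        s = unify(arg_ty, src, {})
        if s is not None:
            s = unify(fun_ty, arrow(tgt, res), s)
        if s is not None:
            results.append((resolve(res, s), f"{e1} ({name} {e2})"))
        # Function coercion: (c e1) e2.
        s = unify(fun_ty, src, {})
        if s is not None:
            s = unify(tgt, arrow(arg_ty, res), s)
        if s is not None:
            results.append((resolve(res, s), f"({name} {e1}) {e2}"))
    return results


def unify1(coercions, fun_ty, arg_ty, e1, e2):
    """Succeed only if unify_c has exactly one distinct result."""
    distinct = list(dict.fromkeys(unify_c(coercions, fun_ty, arg_ty, e1, e2)))
    return distinct[0] if len(distinct) == 1 else None


A, B = con("A"), con("B")
f_ty = arrow(prod(var("a"), var("a")), var("a"))
c1 = ("c1", A, prod(A, A))
c2 = ("c2", A, prod(B, B))

ambiguous = unify_c([c1, c2], f_ty, A, "f", "a")
rejected = unify1([c1, c2], f_ty, A, "f", "a")
accepted = unify1([c1], f_ty, A, "f", "a")
```

With both coercions declared, `ambiguous` contains the two completions from the example above and `unify1` rejects the term; with only `c1` declared, the completion is unique and is accepted.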

These additional side conditions clearly limit the cases in which the algo- 
rithm succeeds. This still allows all the examples presented earlier. However the 
question is how this restricted behaviour can be described in the typing rules. 
For this, we introduce the notion of “derivational coherence”. 

Definition 3 (Derivational Coherence). A term $e$ is derivationally coherent
over a context $\Gamma$ if for each subterm $f$ of $e$ and $\Delta \vdash^{\bar\alpha} f : \tau_1 \Rightarrow e_1'$ and
$\Delta \vdash^{\bar\alpha} f : \tau_2 \Rightarrow e_2'$ occurring anywhere in any derivation of $\Gamma \vdash^{\bar\alpha} e : \tau \Rightarrow e'$ for
any $\tau_1$, $\tau_2$, $e_1'$ and $e_2'$, the two completions are the same, i.e. $e_1' = e_2'$.

Using this notion, we can formulate a soundness and completeness result for 
W}. 

Theorem 3. For all $\Gamma$, $e$, the following holds. There are $\tau$, $S$ and $e'$ such that
$\Gamma \vdash_W^1 e \leadsto (\tau, S, e')$ if and only if $e$ is derivationally coherent over $\Gamma$ and there
are $\sigma$, $f'$ and $\bar\alpha$ such that $\Gamma \vdash^{\bar\alpha} e : \sigma \Rightarrow f'$. In both directions, $e' = f'$ and

For the proof we note that the derivation trees for typing derivation and for 
type checking are isomorphic, and thus we can establish the conditions in which 
ambiguities occur by an inductive analysis of them, using the previous soundness 
and completeness results (Theorems 1 and 2). 

6 Conclusion 

We have presented an extension of the Hindley-Milner polymorphic system with
coercions by incorporating the idea from coercive subtyping. The extended typing
system can be further enriched with other features such as records whose
associated inheritance relation can be represented as coercions. More details of
the work, including a prototype implementation of the extended system and the
details of the proofs, can be found in the forthcoming thesis of the first author
[11].

There are several issues to be further studied. For example, in our rules we
have not included "transitivity" as found in general subtyping or coercive
subtyping systems. For basic types, adding transitivity of coercions is not a
problem; it simply becomes a decidable search problem over the transitive closure
of the coercions between basic types, representable as a finite graph [19].
However, when coercions parameterised over type variables are considered, as
they are allowed here in general, it is not clear to us that coercion search
with transitivity is decidable.

Coercion rules are another field of further study (e.g., see [14]). The current
system would allow rules to be added that derive new coercions from those
already declared, such as lifting coercions over lists. The requirement is that
coercion search must remain decidable.

As mentioned in the introduction, coercion search for type theory is facili- 
tated considerably by the unique typing property. That is no longer given, how- 
ever, if metavariables are added. Thus we can look to apply the techniques of 
this paper to type theory with metavariables. 

Coercion mechanisms as discussed in this paper facilitate overloading among 
other things. Another mechanism for overloading is the class mechanism in 
Haskell [23,10]. An interesting research topic is to compare these mechanisms 
formally and consider a possible general framework for abbreviations. 

Acknowledgements. We thank Paul Callaghan and James McKinna for discussions
and comments on a draft.

Abstract. Coherence is a vital requirement for the correct use of coercive
subtyping for abbreviation and other applications. However, some coercions
are incoherent, although very useful. A typical example is the subtyping
rules for Σ-types: the component-wise rules and the rule of the first
projection. Both of these groups of rules are often used in practice (and
are coherent themselves), but they are incoherent when put together
directly. In this paper, we study this case for Σ-types by introducing a new
subtyping relation; the resulting system enjoys the properties of coherence
and admissibility of substitution and transitivity.



1 Introduction 

Coercive subtyping for dependent type theories, as studied in [16,17] and
implemented in proof assistants such as Lego [19], Coq [3] and Plastic [7], is a
powerful abbreviation mechanism and has been used in applications of proof
development (e.g. [2]). An important requirement of a coercion mechanism is that
of coherence, that is, coercions between any two types must be computationally
equal. This requirement is essential for the consistent use and correct
implementation of the mechanism.

A coherence problem. Some coercions cannot be put together directly in
a coherent way, although they are very useful. A typical example is the
coercions concerning Σ-types (types of dependent pairs). There are at least two
sets of natural and useful coercion rules: the component-wise subtyping rules
and the rule of the first projection. They are coherent separately (see [13] for
the coherence of the former and [2] for the use of the latter), but incoherent
when put together directly (see the counterexample in section 3 for details).
This prevents them from being used together.

Our solution to this coherence problem is to introduce a new subtyping relation
and to give a new formulation of coercive subtyping, which ensures that there is
at most one coercion (with respect to computational equality) between any two
types.

Transitivity. This new formulation not only satisfies the coherence requirement
but also enjoys other properties, in particular the admissibility of
substitution and transitivity, which are important for an implementation of
coercive subtyping. Through our investigation, we found that the admissibility
of transitivity is actually very hard to come by. In this paper, we shall
consider two subtyping relations simultaneously, give new transitivity rules
and prove that all of them are admissible.

In section 2, we shall give an overview of coercive subtyping and introduce 
some concepts such as Well-Defined Coercions and notations to be used later in 
the paper. In section 3, the coherence problem and its solution will be intuitively 
explained through a counter example. In section 4, there is a formal presenta- 
tion of the solution. A new definition of coherence, new rules of substitution 
and transitivity are also given. Some important properties, coherence and the 
admissibility of substitution and transitivity, are proved. Discussions are in the 
last section, where we discuss issues such as decidability and wider applications 
of the methods developed in this paper. 

2 Coercive Subtyping and Well-Defined Coercions 

In this section, we give an overview of coercive subtyping, introduce some no- 
tations and the concept of well-defined coercions that will be used later in the 
paper. 



2.1 Coercive Subtyping 

The basic idea of coercive subtyping, as studied in [17], is that $A$ is a
subtype of $B$ if there is a (unique) coercion $c$ from $A$ to $B$, and
therefore any object of type $A$ may be regarded as an object of type $B$ via
$c$, where $c$ is a functional operation from $A$ to $B$ in the type theory.

A coercion plays the role of abbreviation. More precisely, if $c$ is a coercion
from $K_0$ to $K$, then a functional operation $f$ with domain $K$ can be
applied to any object $k_0$ of $K_0$, and the application $f(k_0)$ is
definitionally equal to $f(c(k_0))$. Intuitively, we can view $f$ as a context
which requires an object of $K$; then the argument $k_0$ in the context $f$
stands for its image under the coercion, $c(k_0)$. Therefore, one can use
$f(k_0)$ as an abbreviation of $f(c(k_0))$.
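
The abbreviation reading can be illustrated with a minimal Python sketch. It is only an analogy and not part of the type theory: the registry `COERCIONS`, the helper `apply_with_coercion` and the function `half` are hypothetical names introduced here, with Python's `int` and `float` standing in for $K_0$ and $K$.

```python
# Hypothetical registry of declared coercions: (source, target) -> coercion c.
COERCIONS = {(int, float): float}


def apply_with_coercion(f, domain, arg):
    """Apply f (whose domain is `domain`) to arg, inserting a coercion if needed.

    If arg already belongs to the domain, this is plain application f(arg).
    Otherwise f(arg) is read as an abbreviation of f(c(arg)), where c is the
    declared coercion from type(arg) to the domain.
    """
    if isinstance(arg, domain):
        return f(arg)
    c = COERCIONS[(type(arg), domain)]  # KeyError if no coercion is declared
    return f(c(arg))


def half(x: float) -> float:
    return x / 2.0


abbreviated = apply_with_coercion(half, float, 3)  # stands for half(float(3))
explicit = half(float(3))
print(abbreviated, explicit)
```

In the sketch the lookup must return at most one coercion per pair of types; this is the role coherence plays in the formal system.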

The above simple idea, when formulated in the logical framework, becomes
very powerful. The second author and his colleagues have developed the
framework of coercive subtyping, which covers a variety of subtyping relations
including those represented by parameterised coercions and coercions between
parameterised inductive types. See [17,2,7,18,8] for details of some of these
developments and applications of coercive subtyping.

Some important meta-theoretic aspects of coercive subtyping have been stud- 
ied. In particular, the results on conservativity and on transitivity elimination 
for subkinding have been proved in [11,21]. The main result of [21] is essentially 
that coherence of basic subtyping rules does imply conservativity. These results 
not only justify the adequacy of the theory from the proof-theoretic consider- 
ation, but also provide the proof-theoretic basis for implementation of coercive 
subtyping. 

How to prove coherence and admissibility of transitivity at the type level has
been studied recently in [13]. In particular, the concept of Well-defined
Coercions has been developed, and suitable subtyping rules for Π-types and
Σ-types have been given as examples to demonstrate these proof techniques.

Coercive subtyping is formally formulated as an extension of (type theories
specified in) the logical framework LF¹, whose rules are given in [15]. Types in
LF are called kinds. The kind $Type$ represents the conceptual universe of types
and a kind of the form $(x : K)K'$ represents the dependent product with
functional operations $f$ as objects (e.g., abstraction $[x : K]k'$), which can
be applied to objects $k$ of kind $K$ to form the application $f(k)$. For every
type $A$ (an object of kind $Type$), $El(A)$ is the kind of objects of $A$. LF
can be used to specify type theories, such as Martin-Löf's type theory [20] and
UTT [15].

As presented in [17], a system with coercive subtyping is an extension of any
type theory specified in LF by a set $\mathcal{R}$ of basic subtyping rules
whose conclusions are subtyping judgements of the form
$\Gamma \vdash A <_c B : Type$. The subtyping rules in $\mathcal{R}$ are
supposed to be coherent.

Notation. We shall use the following notations: 

• We sometimes use $M[x]$ to indicate that variable $x$ may occur free in $M$.

• Context equality: for $\Gamma = x_1 : K_1, \ldots, x_n : K_n$ and
$\Gamma' = x_1 : K'_1, \ldots, x_n : K'_n$, we shall write
$\vdash \Gamma = \Gamma'$ for the sequence of judgements $\vdash K_1 = K'_1$,
..., $x_1 : K_1, \ldots, x_{n-1} : K_{n-1} \vdash K_n = K'_n$.

• Types of non-dependent pairs: if $A$ and $B$ are types, we sometimes write
$A \times B$ for $\Sigma(A, [x : A]B)$ where $x$ is not free in $B$.



2.2 Well-Defined Coercions 

Recently, a new concept of Well-defined Coercions (WDC) has been developed in
[13]. Given a set of coercions that is coherent and has the admissibility
properties, we prove that, after new subtyping rules are added, the extended
system still keeps the coherence and admissibility properties.

Definition 1. (Well-defined coercions) If $\mathcal{C}$ is a set of subtyping
judgements of the form $\Gamma \vdash M <_d M' : Type$ which satisfies the
following conditions, we say that $\mathcal{C}$ is a well-defined set of
judgements for coercions, or briefly Well-defined Coercions (WDC).

1. (Coherence)

a) $\Gamma \vdash A <_c B : Type \in \mathcal{C}$ implies
$\Gamma \vdash A : Type$, $\Gamma \vdash B : Type$ and
$\Gamma \vdash c : (A)B$.

b) $\Gamma \vdash A <_c A : Type \notin \mathcal{C}$ for any $\Gamma$, $A$ and
$c$.

c) $\Gamma \vdash A <_{c_1} B : Type \in \mathcal{C}$ and
$\Gamma \vdash A <_{c_2} B : Type \in \mathcal{C}$ imply
$\Gamma \vdash c_1 = c_2 : (A)B$.

2. (Congruence) $\Gamma \vdash A <_c B : Type \in \mathcal{C}$,
$\Gamma \vdash A = A' : Type$, $\Gamma \vdash B = B' : Type$ and
$\Gamma \vdash c = c' : (A)B$ imply
$\Gamma \vdash A' <_{c'} B' : Type \in \mathcal{C}$.

3. (Transitivity) $\Gamma \vdash A <_{c_1} B : Type \in \mathcal{C}$ and
$\Gamma \vdash B <_{c_2} A' : Type \in \mathcal{C}$ imply
$\Gamma \vdash A <_{c_2 \circ c_1} A' : Type \in \mathcal{C}$.

4. (Substitution) $\Gamma, x : K, \Gamma' \vdash A <_c B : Type \in \mathcal{C}$
implies, for any $k$ such that $\Gamma \vdash k : K$,
$\Gamma, [k/x]\Gamma' \vdash [k/x]A <_{[k/x]c} [k/x]B : Type \in \mathcal{C}$.

5. (Weakening) $\Gamma \vdash A <_c B : Type \in \mathcal{C}$,
$\Gamma \subseteq \Gamma'$ and $\Gamma'$ is valid imply
$\Gamma' \vdash A <_c B : Type \in \mathcal{C}$.

1 The LF here is different from the Edinburgh Logical Framework [10].



In this paper, the set $\mathcal{R}$ of basic coercion rules includes the
following rule, where $\mathcal{C}$ is a WDC:

(WDCrule)

$$\frac{\Gamma \vdash A <_c B : Type \in \mathcal{C}}{\Gamma \vdash A <_c B : Type}$$



3 The Basic Subtyping Rules and the Coherence Problem 



In this section, we give a counterexample to illustrate the coherence problem
that arises between the component-wise subtyping rules for Σ-types and the
subtyping rule of the first projection, and we explain our solution informally.



Subtyping rules for Σ-types. As studied in [13], there are three
component-wise subtyping rules for Σ-types. One of these rules is the following.

(First Component rule)

$$\frac{\Gamma \vdash A <_c A' : Type \quad \Gamma \vdash B : (A')Type}{\Gamma \vdash \Sigma(A, B \circ c) <_{d_1} \Sigma(A', B) : Type}$$

where $d_1 = [z : \Sigma(A, B \circ c)]\,pair(A', B, c(\pi_1(A, B \circ c, z)), \pi_2(A, B \circ c, z))$,
which basically means that, for example, $A \times B$ is a subtype of
$A' \times B$ if $A$, $A'$ and $B$ are types and $A$ is a subtype of $A'$.

The coercion of the first projection is very useful; for example, it is used
significantly in Bailey's PhD thesis [2] for the formalisation of mathematics.
Formally, the subtyping rule is the following:

($\pi_1$ rule)

$$\frac{\Gamma \vdash A : Type \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Sigma(A, B) <_{\pi_1(A,B)} A : Type}$$



With this coercion, it is very easy to express some mathematical properties. For 
example, the type of collection of groups is a subtype of the type of semi-groups 
(i.e. a group is also a semi-group). Any functional operator with the domain of 
semi-groups can be applied to any group with a coercion. 



A counterexample. If the subtyping rule ($\pi_1$ rule) and the component-wise
subtyping rules for Σ-types are combined, we have the following two
derivations.

The first derivation is

$$\frac{\dfrac{\Gamma \vdash A : Type \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Sigma(A, B) : Type} \quad \dfrac{\Gamma \vdash B : (A)Type}{\Gamma \vdash B \circ \pi_1(A, B) : (\Sigma(A, B))Type}}{\Gamma \vdash \Sigma(\Sigma(A, B), B \circ \pi_1(A, B)) <_{d_1} \Sigma(A, B) : Type}$$

and the rule ($\pi_1$ rule) is used in the last step.

The second derivation is

$$\frac{\dfrac{\Gamma \vdash A : Type \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Sigma(A, B) <_{\pi_1(A,B)} A : Type}\,(\pi_1\ rule) \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Sigma(\Sigma(A, B), B \circ \pi_1(A, B)) <_{d_2} \Sigma(A, B) : Type}$$

and the rule ($\pi_1$ rule) is used in the first step and the First Component
rule is used in the last step.

There are two coercions $d_1$ and $d_2$ from type
$\Sigma(\Sigma(A, B), B \circ \pi_1(A, B))$ to type $\Sigma(A, B)$², and we have
the following equations:

$$d_1(pair(pair(a, b_1), b_2)) = pair(a, b_1)$$
$$d_2(pair(pair(a, b_1), b_2)) = pair(a, b_2)$$

We can see that $d_1$ and $d_2$ are neither computationally nor extensionally
equal. Hence coherence, the vital requirement of a coercive subtyping system,
fails.
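
The two equations can be checked on concrete values with a short Python sketch. Tuples stand in for dependent pairs, and the names `pi1`, `pi2`, `d1` and `d2` are illustrative choices made here, not an implementation of the type theory.

```python
def pi1(p):
    """First projection of a pair."""
    return p[0]


def pi2(p):
    """Second projection of a pair."""
    return p[1]


def d1(z):
    # Coercion obtained from the pi_1 rule alone:
    # the first projection of the outer pair.
    return pi1(z)


def d2(z):
    # Coercion obtained from the First Component rule with c = pi_1:
    # coerce the first component with c and keep the second component.
    return (pi1(pi1(z)), pi2(z))


z = (("a", "b1"), "b2")
print(d1(z))  # ('a', 'b1')
print(d2(z))  # ('a', 'b2')
```

Both functions map the same nested pair to a pair over the same types, yet they disagree whenever the two second components differ.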



Informal explanation of our solution. From the above counterexample, we
see that the existence of the two derivations makes the system incoherent. To
make it coherent, a natural way is to block one of the derivations. The first
one cannot be blocked, otherwise the first projection ($\pi_1$) could no longer
be regarded as a coercion. Hence we can only block the second derivation.
More precisely, we must not allow $\Gamma \vdash A <_c A' : Type$ to be used as
the first premise of the component-wise subtyping rules if it is (directly)
derived from the $\pi_1$ rule. In other words, a condition of the
component-wise subtyping rules is that the first premise is not (directly)
derived from the $\pi_1$ rule. There are several ways to try to satisfy this
condition, one of which is to consider a notion of size as a side-condition,
because $A$ is a sub-term of $\Sigma(A, B)$ in the conclusion of the $\pi_1$
rule, and their sizes are intuitively different. However, the well-definedness
of size is problematic when we present the whole subtyping system (see the
discussion section for more details).

In this paper, rather than thinking of any side-conditions, we introduce a new
subtyping relation ($\prec$) to represent the coercion $\pi_1$. This new
subtyping relation will never appear in the first premises of the
component-wise subtyping rules, and hence unwanted derivations such as the
second one in the counterexample are blocked.

To make the subtyping system coherent is one thing; to make it also enjoy
the property of admissibility of transitivity is another. During our
investigation, we found that some formulations satisfy coherence but not
the admissibility of transitivity. The formulation in the next section enjoys
all these properties.



2 There are two different coercions from $(A \times B) \times B$ to $A \times B$ if $A$ and $B$ are types.

4 A Formal Presentation

In this section, we shall give a formal presentation of a new subtyping relation 
and related subtyping rules. The coherence and admissibility of substitution and 
transitivity will also be proved. 



4.1 A New Subtyping Relation 

We have seen the problem with the combination of the component-wise subtyping
rules and the subtyping rule of the first projection. Now, we introduce a new
relation $\prec$ to solve this problem and consider a new system which is an
extension of coercive subtyping with the judgement form:

• $\Gamma \vdash A \prec_c B : Type$ asserts that type $A$ is a subtype of type
$B$ with coercion $c$.

As we will see later, the subtyping relations $<$ and $\prec$ are different:
$\prec$ represents the idea that $\pi_1$ is regarded as a coercion, but $<$
does not.

The coercive definition rules. The main idea of coercive subtyping can
informally be represented by the following coercive definition rule (contexts
are omitted):

$$\frac{K <_c K' \quad k : K \quad f : (x : K')K''}{f(k) = f(c(k)) : [c(k)/x]K''}$$

The same idea applies to the new subtyping relation. A new basic subkinding
rule for $\prec$ is the following:

$$\frac{A \prec_c B : Type}{El(A) <_c El(B)}$$

By the coercive definition rule, we have the following derivable rule:

$$\frac{A \prec_c B : Type \quad k : El(A) \quad f : (x : El(B))K}{f(k) = f(c(k)) : [c(k)/x]K}$$

which says that if $A \prec_c B$, any functional operator $f$ with domain $B$
can be applied to any object $x$ of $A$, and $f(x) = f(c(x))$.

We present the new subtyping system in two stages: first an intermediate
system $T[\mathcal{R}\pi_1]_0$ and the definition of coherence, and then the
system $T[\mathcal{R}\pi_1]$.



4.2 The Systems $T[\mathcal{R}\pi_1]_0$ and $T[\mathcal{R}\pi_1]$

Formally, $T[\mathcal{R}\pi_1]_0$ is an extension of type theory $T$ (only)
with the following rules:

• A set $\mathcal{R}$ of basic subtyping rules whose conclusions are subtyping
judgements of the form $\Gamma \vdash A <_c B : Type$.

• The following congruence rule for subtyping judgements

(Cong)

$$\frac{\Gamma \vdash A <_c B : Type \quad \Gamma \vdash A = A' : Type \quad \Gamma \vdash B = B' : Type \quad \Gamma \vdash c = c' : (A)B}{\Gamma \vdash A' <_{c'} B' : Type}$$

• The new subtyping rules for the first projection in Figure 1, whose
conclusions are of the form $\Gamma \vdash A \prec_c B : Type$.



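Notation: we shall use $\Gamma \vdash A \propto_c B : Type$ to represent
$\Gamma \vdash A <_c B : Type$ or $\Gamma \vdash A \prec_c B : Type$. For
example, a rule with one premise of the form $\Gamma \vdash A \propto_c B : Type$
actually represents two rules, one with the premise
$\Gamma \vdash A <_c B : Type$ and one with the premise
$\Gamma \vdash A \prec_c B : Type$, and a rule with two premises of this form
represents four rules. We shall also say that $A$ is a subtype of $B$, or that
there is a coercion $c$ from $A$ to $B$, if $\Gamma \vdash A \propto_c B : Type$.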



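New subtyping rules for the first projection:

$$\frac{\Gamma \vdash A : Type \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Sigma(A, B) \prec_{\pi_1(A,B)} A : Type}$$

$$\frac{\Gamma \vdash A \propto_c A' : Type \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Sigma(A, B) \prec_{c \circ \pi_1(A,B)} A' : Type}$$

New congruence rule:

$$\frac{\Gamma \vdash A \prec_c B : Type \quad \Gamma \vdash A = A' : Type \quad \Gamma \vdash B = B' : Type \quad \Gamma \vdash c = c' : (A)B}{\Gamma \vdash A' \prec_{c'} B' : Type}$$

Fig. 1. New subtyping rules for the first projection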



Remark 1. We have the following remarks.

• The basic understanding of the new subtyping rules for the first projection
is that $\Sigma(A, B)$ is a subtype of $A'$ if $A = A'$ or $A$ is a subtype of
$A'$.

• New substitution and transitivity rules for the subtyping relations $<$ and
$\prec$ will be given later, and we will prove that all of them are admissible.
We do not include them in $T[\mathcal{R}\pi_1]_0$.

New subtyping rules for parameterised inductive types. Now, we give
the component-wise subtyping rules for Σ-types and the rules for Π-types in
Figures 2 and 3 to demonstrate what the subtyping rules should be for the new
subtyping relation.

First Component rule:

$$\frac{\Gamma \vdash A <_c A' : Type \quad \Gamma \vdash B : (A')Type}{\Gamma \vdash \Sigma(A, B \circ c) <_{d_1} \Sigma(A', B) : Type}$$

where $d_1 = [z : \Sigma(A, B \circ c)]\,pair(A', B, c(\pi_1(A, B \circ c, z)), \pi_2(A, B \circ c, z))$

Second Component rule:

$$\frac{\Gamma \vdash B : (A)Type \quad \Gamma \vdash B' : (A)Type \quad \Gamma, x : A \vdash B(x) \propto_{e[x]} B'(x) : Type}{\Gamma \vdash \Sigma(A, B) <_{d_2} \Sigma(A, B') : Type}$$

where $d_2 = [z : \Sigma(A, B)]\,pair(A, B', \pi_1(A, B, z), e[\pi_1(A, B, z)](\pi_2(A, B, z)))$

First-Second Component rule:

$$\frac{\Gamma \vdash A <_c A' : Type \quad \Gamma \vdash B : (A)Type \quad \Gamma \vdash B' : (A')Type \quad \Gamma, x : A \vdash B(x) \propto_{e[x]} B'(c(x)) : Type}{\Gamma \vdash \Sigma(A, B) <_{d_3} \Sigma(A', B') : Type}$$

where $d_3 = [z : \Sigma(A, B)]\,pair(A', B', c(\pi_1(A, B, z)), e[\pi_1(A, B, z)](\pi_2(A, B, z)))$

Fig. 2. New component-wise subtyping rules for Σ-types

Domain rule:

$$\frac{\Gamma \vdash A' \propto_c A : Type \quad \Gamma \vdash B : (A)Type}{\Gamma \vdash \Pi(A, B) <_{d_1} \Pi(A', B \circ c) : Type}$$

where $d_1 = [f : \Pi(A, B)]\,\lambda(A', B \circ c, app(A, B, f) \circ c)$

Codomain rule:

$$\frac{\Gamma \vdash B : (A)Type \quad \Gamma \vdash B' : (A)Type \quad \Gamma, x : A \vdash B(x) \propto_{e[x]} B'(x) : Type}{\Gamma \vdash \Pi(A, B) <_{d_2} \Pi(A, B') : Type}$$

where $d_2 = [f : \Pi(A, B)]\,\lambda(A, B', [x : A]\,e[x](app(A, B, f, x)))$

Domain-Codomain rule:

$$\frac{\Gamma \vdash A' \propto_c A : Type \quad \Gamma \vdash B : (A)Type \quad \Gamma \vdash B' : (A')Type \quad \Gamma, x' : A' \vdash B(c(x')) \propto_{e[x']} B'(x') : Type}{\Gamma \vdash \Pi(A, B) <_{d_3} \Pi(A', B') : Type}$$

where $d_3 = [f : \Pi(A, B)]\,\lambda(A', B', [x' : A']\,e[x'](app(A, B, f, c(x'))))$

Fig. 3. New subtyping rules for Π-types
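
As an informal illustration of the Domain rule, the coercion $d_1$ sends a function $f$ on $A$ to the function $f \circ c$ on $A'$, so the premise runs from $A'$ to $A$ while the conclusion runs from $\Pi(A, B)$ to $\Pi(A', B \circ c)$. The Python sketch below shows only this computational content for non-dependent functions; `domain_coercion`, `first` and `double` are hypothetical names introduced for the example.

```python
def domain_coercion(c):
    """Given a coercion c from A' to A, return d1 with d1(f) = f composed with c."""
    def d1(f):
        return lambda x: f(c(x))
    return d1


# A first projection stands in for a coercion from pairs (A') to numbers (A).
first = lambda p: p[0]

lift = domain_coercion(first)
double = lambda n: 2 * n      # a function whose domain is A
double_on_pairs = lift(double)  # the coerced function, whose domain is A'

print(double_on_pairs((21, "ignored")))  # 42
```

The premise of the Domain rule is written with $\propto$, so the coercion `c` used here may come from either subtyping relation, in contrast to the first premise of the component-wise rules for Σ-types.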
Remark 2. We have the following remarks.

• In Figures 2 and 3, the conclusions of the rules are always of the form
$\Gamma \vdash A <_c B : Type$, no matter whether the premises are of the form
$\Gamma \vdash A <_c B : Type$ or $\Gamma \vdash A \prec_c B : Type$.

• The essence of the new subtyping relation is that the judgement form
$\Gamma \vdash A \prec_c B : Type$ is never used in the premises for the first
component in the component-wise subtyping rules in Figure 2. Hence the second
derivation of the counterexample in section 3 is blocked.

• The basic understanding of the new subtyping rules for Π-types is that
$\Pi(A, B)$ is a subtype of $\Pi(A', B')$ if $A'$ is a subtype of $A$ and $B$
is a sub-family of $B'$ (we omit other cases such as: $\Pi(A, B)$ is a subtype
of $\Pi(A, B')$ if $B$ is a sub-family of $B'$).

• For the new component-wise subtyping rules for Σ-types, because of the
incoherence when $\pi_1$ is also regarded as a coercion, we need a stricter
understanding, that is, $\Sigma(A, B)$ is a subtype of $\Sigma(A', B')$ if $A$
is a subtype of $A'$, $B$ is a sub-family of $B'$ and the sizes of $A$ and $A'$
are the same (size is defined in Definition 4). In the following section, we
will prove that the sizes of $A$ and $B$ are the same if
$\Gamma \vdash A <_c B : Type$, and that the size of $A$ is bigger than the
size of $B$ if $\Gamma \vdash A \prec_c B : Type$.

The subtyping system we have presented covers all the coercions derived from
the component-wise subtyping rules and the subtyping rule for the first
projection when they are used separately. Actually, it has more coercions. For
example, if $A$, $B$ and $C$ are different types, we can have a coercion from
$A \times (B \times C)$ to $A \times B$ because there is a coercion from
$B \times C$ to $B$. But we can never derive a coercion from
$A \times (B \times C)$ to $A \times B$ by the component-wise subtyping rules
or the subtyping rule for the first projection separately. What we have excluded
are those coercions that need the component-wise subtyping rules for Σ-types
but whose first components have different sizes. For example, we do not have a
coercion from $(A \times B) \times C$ to $A \times C$, because the sizes of
$A \times B$ and $A$ are different although there is a coercion from
$A \times B$ to $A$.
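
These examples can be replayed with a small derivability checker. The following Python sketch makes strong simplifying assumptions that are not in the paper: types are only atoms (strings) and non-dependent products, the WDC set is empty, and type equality is syntactic, so the congruence rules are not modelled. The names `Prod`, `lt`, `prec` and `rel` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prod:
    """Non-dependent pair type A x B; atomic types are plain strings."""
    fst: object
    snd: object


def lt(a, b):
    """a < b: derivable by one of the component-wise rules of Fig. 2."""
    if not (isinstance(a, Prod) and isinstance(b, Prod)):
        return False
    first = lt(a.fst, b.fst)    # first premise must use <, never prec
    second = rel(a.snd, b.snd)  # second premise may use either relation
    return ((first and a.snd == b.snd)      # First Component rule
            or (a.fst == b.fst and second)  # Second Component rule
            or (first and second))          # First-Second Component rule


def prec(a, b):
    """a prec b: derivable by one of the first-projection rules of Fig. 1."""
    if not isinstance(a, Prod):
        return False
    return a.fst == b or rel(a.fst, b)


def rel(a, b):
    """Either subtyping relation holds between a and b."""
    return lt(a, b) or prec(a, b)


A, B, C = "A", "B", "C"
print(lt(Prod(A, Prod(B, C)), Prod(A, B)))   # True
print(rel(Prod(Prod(A, B), C), Prod(A, C)))  # False
```

Under these assumptions the checker also separates the two relations on the counterexample of section 3: only `prec` holds from $(A \times B) \times B$ to $A \times B$.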

In $T[\mathcal{R}\pi_1]_0$, the subtyping judgements do not contribute to any
derivation of a judgement of any other form in the original type theory $T$.
Therefore, we have the following lemma.

Lemma 1. $T[\mathcal{R}\pi_1]_0$ is a conservative extension of $T$.

Remark 3. As the two subtyping relations $<$ and $\prec$ do contribute to each
other, $T[\mathcal{R}\pi_1]_0$ is not a conservative extension of
$T[\mathcal{R}]_0$, whose subtyping judgements are only of the form
$\Gamma \vdash A <_c B : Type$ (see [17] for details).

Now, we define the most basic requirement for the new subtyping relation in the 
following. 

Definition 2. (Coherence condition of $T[\mathcal{R}\pi_1]_0$) We say that
$T[\mathcal{R}\pi_1]_0$ is coherent if it has the following properties.




Combining Incoherent Coercions for Σ-Types



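1. $\Gamma \vdash A \propto_c B : Type$ implies $\Gamma \vdash A : Type$,
$\Gamma \vdash B : Type$, and $\Gamma \vdash c : (A)B$.

2. $\Gamma \vdash A \propto_c B : Type$ implies
$\Gamma \nvdash A = B : Type$.

3. $\Gamma \vdash A <_c B : Type$ and $\Gamma \vdash A <_{c'} B : Type$ imply
$\Gamma \vdash c = c' : (A)B$.

4. $\Gamma \vdash A \prec_c B : Type$ and $\Gamma \vdash A \prec_{c'} B : Type$
imply $\Gamma \vdash c = c' : (A)B$.

5. (Disjointness) $\Gamma \vdash A <_c B : Type$ implies
$\Gamma \nvdash A \prec_{c'} B : Type$ for any $c'$, and vice versa,
$\Gamma \vdash A \prec_c B : Type$ implies $\Gamma \nvdash A <_{c'} B : Type$
for any $c'$.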



Remark 4. One may consider a more general coherence condition such as: if
$\Gamma \vdash A \propto_c B : Type$ and $\Gamma \vdash A \propto_{c'} B : Type$
then $\Gamma \vdash c = c' : (A)B$. This would include the case in which both
$\Gamma \vdash A <_c B : Type$ and $\Gamma \vdash A \prec_{c'} B : Type$ hold.
However, one of the reasons we need the new subtyping relation ($\prec$) is
precisely to make sure that $\Gamma \vdash A <_c B : Type$ and
$\Gamma \vdash A \prec_{c'} B : Type$ never hold at the same time for any $A$
and $B$. Disjointness is therefore regarded as a part of the coherence
condition.



The system $T[\mathcal{R}\pi_1]$. The system $T[\mathcal{R}\pi_1]$ is an
extension of $T[\mathcal{R}\pi_1]_0$ with the inference rules in the Appendix.
Compared with the original subkinding rules in [17], a new rule is added.

(New Basic Subkinding Rule)

$$\frac{\Gamma \vdash A \prec_c B : Type}{\Gamma \vdash El(A) <_c El(B)}$$

There is only one subkinding judgement form $\Gamma \vdash K <_c K'$, although
there are two subtyping judgement forms $\Gamma \vdash A <_c B : Type$ and
$\Gamma \vdash A \prec_c B : Type$. At the kind level, we are concerned only
with the existence of a coercion, regardless of which form it is derived from
at the type level.



Remark 5. The main result of [21] is essentially that coherence of subtyping
rules does imply conservativity. In the next section, we shall also prove the
coherence of $T[\mathcal{R}\pi_1]_0$. So, $T[\mathcal{R}\pi_1]$ is also expected
to be a conservative extension of $T$.



4.3 Coherence of $T[\mathcal{R}\pi_1]_0$

Now, we prove the coherence of $T[\mathcal{R}\pi_1]_0$, which essentially says
that coercions between any two types must be unique. In this paper, the set
$\mathcal{R}$ of basic subtyping rules consists of the rule (WDCrule) and the
new subtyping rules for Σ-types and Π-types (in Figures 2 and 3), and the
system $T[\mathcal{R}\pi_1]_0$ also includes the congruence rule (Cong) and the
new subtyping rules in Figure 1. Furthermore, we assume that for any judgement
$\Gamma \vdash A <_c B : Type \in \mathcal{C}$, neither $A$ nor $B$ is
computationally equal to a Σ-type or Π-type. We also assume that the original
type theory $T$ has good properties, in particular the properties of
Church-Rosser and Strong Normalisation and the property of context replacement
by equal kinds.

We give a definition of $size(A)$ that only counts how many times $\pi_1$ can
be applied to an object of type $A$. In order to define size, we define presize
first.

Definition 3. (presize) Let $\Gamma \vdash M : Type$ be a derivable judgement
in $T[\mathcal{R}\pi_1]_0$ and $M$ a normal form (i.e. $M = nf(M)$).

1. If $M$ is not a Σ-type then $presize(M) =_{df} 0$.

2. If $M = \Sigma(A, B)$ then $presize(M) =_{df} presize(A) + 1$.

Remark 6. For the second case, because $M$ is a normal form, so is $A$.
Therefore presize is well-defined.

Definition 4. (size) Let $\Gamma \vdash M : Type$ be a derivable judgement in
$T[\mathcal{R}\pi_1]_0$. Then $size(M) =_{df} presize(nf(M))$.

Remark 7. $T[\mathcal{R}\pi_1]_0$ is a conservative extension of $T$ and every
well-typed term in $T$ has a unique normal form. So, the value of $size(M)$ is
unique and size is well-defined.
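
The recursion behind presize and size is short enough to write out. The Python sketch below assumes that types are already in normal form, so the call to $nf$ is omitted and `size` coincides with presize; the `Sigma` class is a hypothetical representation in which the second field stands for the type family $B$ and atomic types are plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sigma:
    """A Sigma-type Sigma(A, B), assumed to be in normal form."""
    fst: object  # the type A
    snd: object  # stands in for the family B; it does not affect the size


def size(m):
    """Count how many times the first projection can be applied."""
    if isinstance(m, Sigma):
        return size(m.fst) + 1
    return 0


print(size("A"))                          # 0
print(size(Sigma(Sigma("A", "B"), "C")))  # 2
print(size(Sigma("A", Sigma("B", "C"))))  # 1
```

Only the first component contributes, which is why $(A \times B) \times C$ has size 2 while $A \times (B \times C)$ has size 1.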



Lemma 2. In $T[\mathcal{R}\pi_1]_0$, if $\Gamma \vdash M_1 = M_2 : Type$ then
$size(M_1) = size(M_2)$.

Proof. $T[\mathcal{R}\pi_1]_0$ is a conservative extension of $T$ and $T$ has
the properties of Church-Rosser and strong normalisation, so
$nf(M_1) = nf(M_2)$.

Lemma 3. Let $\Gamma \vdash M : Type$ be a derivable judgement in
$T[\mathcal{R}\pi_1]_0$.

• If $M$ is not computationally equal to a Σ-type then $size(M) = 0$, and

• if $\Gamma \vdash M = \Sigma(A, B) : Type$ then $size(M) = size(A) + 1$.

Proof. By the definition of size and Lemma 2.

Lemma 4. In $T[\mathcal{R}\pi_1]_0$, if $\Gamma \vdash M_1 <_d M_2 : Type$ then
$size(M_1) = size(M_2)$.

Proof. By induction on derivations, using Lemma 2 and Lemma 3. Note that
$size(M_1) = size(M_2) = 0$ if the last rule of
$\Gamma \vdash M_1 <_d M_2 : Type$ is one of the rules for Π-types.

Lemma 5. In $T[\mathcal{R}\pi_1]_0$, if $\Gamma \vdash M_1 \prec_c M_2 : Type$
then $size(M_1) > size(M_2)$.

Proof. By induction on derivations, using Lemma 2, Lemma 3 and Lemma 4.

The following theorems prove the coherence of $T[\mathcal{R}\pi_1]_0$.

Theorem 1.

• If $\Gamma \vdash M_1 \propto_c M_2 : Type$ then $\Gamma \vdash M_1 : Type$,
$\Gamma \vdash M_2 : Type$ and $\Gamma \vdash c : (M_1)M_2$.

• If $\Gamma \vdash M_1 \propto_c M_2 : Type$ then
$\Gamma \nvdash M_1 = M_2 : Type$.

• If $\Gamma \vdash M_1 \prec_c M_2 : Type$ then
$\Gamma \nvdash M_1 <_d M_2 : Type$ for any $d$. And vice versa, if
$\Gamma \vdash M_1 <_c M_2 : Type$, then
$\Gamma \nvdash M_1 \prec_d M_2 : Type$ for any $d$.

Proof. By induction on derivations, using the definition of WDC, Lemma 4 and
Lemma 5.

Theorem 2. If $\vdash \Gamma = \Gamma'$, $\Gamma \vdash M_1 = M'_1 : Type$ and
$\Gamma \vdash M_2 = M'_2 : Type$, and

1. $\Gamma \vdash M_1 <_d M_2 : Type$ and
$\Gamma' \vdash M'_1 <_{d'} M'_2 : Type$, or

2. $\Gamma \vdash M_1 \prec_d M_2 : Type$ and
$\Gamma' \vdash M'_1 \prec_{d'} M'_2 : Type$,

then $\Gamma \vdash d = d' : (M_1)M_2$.

Proof. By induction on derivations. The most important argument in this proof
is that any derivations of $\Gamma \vdash M_1 <_d M_2$ and
$\Gamma' \vdash M'_1 <_{d'} M'_2$, or of $\Gamma \vdash M_1 \prec_d M_2$ and
$\Gamma' \vdash M'_1 \prec_{d'} M'_2$, must contain sub-derivations whose last
rules are the same rule, followed by applications of the congruence rules.

4.4 Admissibility of Substitution and Transitivity 

Now, we give the subtyping rules of substitution and transitivity and, prove 
that these rules are admissible. In an implementation of coercive subtyping, 
these rules are ignored simply because they cannot be directly implemented. 
For this reason among others, proving the admissibility of such rules (or their 
elimination) is always an important task for any subtyping system. 

Admissible substitution rules. The substitution rules are as follows, which
are what we would normally expect.

$$\frac{\Gamma, x : K, \Gamma' \vdash A <_c B : Type \quad \Gamma \vdash k : K}{\Gamma, [k/x]\Gamma' \vdash [k/x]A <_{[k/x]c} [k/x]B : Type}$$

$$\frac{\Gamma, x : K, \Gamma' \vdash A \prec_c B : Type \quad \Gamma \vdash k : K}{\Gamma, [k/x]\Gamma' \vdash [k/x]A \prec_{[k/x]c} [k/x]B : Type}$$

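Admissible transitivity rules. We give the following four transitivity rules,
which basically say that if there are coercions $c_1$ and $c_2$ from type $A$
to $B$ and from type $B$ to $C$, then $c_2 \circ c_1$ is a coercion from type
$A$ to $C$.

$$\frac{\Gamma \vdash A <_{c_1} B : Type \quad \Gamma \vdash B <_{c_2} C : Type}{\Gamma \vdash A <_{c_2 \circ c_1} C : Type} \qquad \frac{\Gamma \vdash A \prec_{c_1} B : Type \quad \Gamma \vdash B \prec_{c_2} C : Type}{\Gamma \vdash A \prec_{c_2 \circ c_1} C : Type}$$

$$\frac{\Gamma \vdash A <_{c_1} B : Type \quad \Gamma \vdash B \prec_{c_2} C : Type}{\Gamma \vdash A \prec_{c_2 \circ c_1} C : Type} \qquad \frac{\Gamma \vdash A \prec_{c_1} B : Type \quad \Gamma \vdash B <_{c_2} C : Type}{\Gamma \vdash A \prec_{c_2 \circ c_1} C : Type}$$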

Remark 8. The above transitivity rules are sufficient and correct, in the sense
that, first, they capture the meaning of transitivity, and second, they enjoy
the properties in Lemmas 4 and 5. Other rules of different combinations, such
as the rule

$$\frac{\Gamma \vdash A <_{c_1} B : Type \quad \Gamma \vdash B <_{c_2} C : Type}{\Gamma \vdash A \prec_{c_2 \circ c_1} C : Type}$$

are not correct and contradict the above properties.
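
The agreement between the four rules and the size properties can be checked mechanically on small cases. The Python sketch below is an illustration under assumptions made here, not part of the proof: relations are encoded as the strings `LT` and `PREC`, `compose` returns the relation concluded by the four rules, and `size_ok` encodes Lemma 4 (sizes are equal for $<$) and Lemma 5 (the size strictly decreases for $\prec$).

```python
LT, PREC = "lt", "prec"


def compose(r1, r2):
    """Relation concluded by the transitivity rules from premises r1 and r2."""
    return LT if (r1 == LT and r2 == LT) else PREC


def size_ok(rel, size_src, size_tgt):
    """Size constraint of Lemma 4 (for LT) or Lemma 5 (for PREC)."""
    return size_src == size_tgt if rel == LT else size_src > size_tgt


# Whenever both premises respect the size lemmas, so does the conclusion.
for a in range(4):
    for b in range(4):
        for c in range(4):
            for r1 in (LT, PREC):
                for r2 in (LT, PREC):
                    if size_ok(r1, a, b) and size_ok(r2, b, c):
                        assert size_ok(compose(r1, r2), a, c)

# The rejected rule concludes PREC from two LT premises, which would demand a
# strict size decrease between types of equal size.
print(size_ok(PREC, 1, 1))  # False
```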
Theorem 3. (Substitution in $T[\mathcal{R}\pi_1]_0$) If $\Gamma \vdash k : K$
and

1. if $\Gamma, x : K, \Gamma' \vdash M_1 <_c M_2 : Type$, then
$\Gamma, [k/x]\Gamma' \vdash [k/x]M_1 <_{[k/x]c} [k/x]M_2 : Type$, and

2. if $\Gamma, x : K, \Gamma' \vdash M_1 \prec_c M_2 : Type$, then
$\Gamma, [k/x]\Gamma' \vdash [k/x]M_1 \prec_{[k/x]c} [k/x]M_2 : Type$.

Proof. By induction on derivations. 

In order to prove the admissibility of the transitivity rules, we also need to 
prove the theorem about weakening. 

Theorem 4. (Weakening in $T[\mathcal{R}\pi_1]_0$) If
$\Gamma \subseteq \Gamma'$, $\Gamma'$ is valid and

1. if $\Gamma \vdash M_1 <_c M_2 : Type$ then
$\Gamma' \vdash M_1 <_c M_2 : Type$, and

2. if $\Gamma \vdash M_1 \prec_c M_2 : Type$ then
$\Gamma' \vdash M_1 \prec_c M_2 : Type$.

Proof. By induction on derivations. 

To prove the admissibility of the transitivity rules, the usual methods (e.g.
induction on derivations) do not seem to work. We develop a new measure (Depth)
that is an adaptation of the measure (depth) developed by Chen, Aspinall and
Compagnoni [9]. In the measure Depth, only the subtyping judgements ($<$ and
$\prec$) count.



Definition 5. (Depth) Let $D$ be a derivation of a subtyping judgement of the
form $\Gamma \vdash A <_c B : Type$ or $\Gamma \vdash A \prec_c B : Type$,

$$\frac{S_1 \ \ldots \ S_n \quad T_1 \ \ldots \ T_m}{\Gamma \vdash A \propto_c B : Type}$$

where $\Gamma \vdash A \propto_c B : Type$ represents
$\Gamma \vdash A <_c B : Type$ or $\Gamma \vdash A \prec_c B : Type$,
$S_1, \ldots, S_n$ are derivations of subtyping judgements of the form
$\Gamma \vdash M_1 <_d M_2 : Type$ or $\Gamma \vdash M_1 \prec_d M_2 : Type$,
and $T_1, \ldots, T_m$ are derivations of other forms of judgements. Then

$$Depth(D) =_{df} 1 + \max\{Depth(S_1), \ldots, Depth(S_n)\}$$

In particular, if $n = 0$ then $Depth(D) =_{df} 1$.
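
A minimal Python sketch of this measure follows, assuming a derivation is represented as a tree whose nodes record whether their conclusion is a subtyping judgement. The `Derivation` class, its field names and the rule labels are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Derivation:
    rule: str           # label of the last rule applied
    is_subtyping: bool  # True if the conclusion uses < or prec
    premises: List["Derivation"] = field(default_factory=list)


def depth(d: Derivation) -> int:
    """Depth of a derivation of a subtyping judgement.

    Only premises that are themselves subtyping derivations (the S_i) count;
    derivations of other judgement forms (the T_j) are ignored.
    """
    if not d.is_subtyping:
        raise ValueError("Depth is only defined for subtyping derivations")
    sub_depths = [depth(p) for p in d.premises if p.is_subtyping]
    return 1 + max(sub_depths, default=0)


# A first-projection step has no subtyping premises, so its depth is 1.
typing_a = Derivation("A : Type", False)
typing_b = Derivation("B : (A)Type", False)
proj = Derivation("first projection", True, [typing_a, typing_b])
# A component-wise step that uses it as a subtyping premise has depth 2.
comp = Derivation("Second Component", True, [typing_b, proj])

print(depth(proj), depth(comp))  # 1 2
```

Because typing premises are ignored, replacing a typing sub-derivation by a longer one leaves the depth unchanged, which is what the lemmas below rely on.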

The following lemmas show that, from a derivation D of a subtyping judgement 
J one can always get a derivation D' of the judgement obtained from J by 
context replacement such that D and D' have the same depth. 

Lemma 6. If $\vdash \Gamma = \Gamma'$ and

1. if $D$ is a derivation of $\Gamma \vdash M_1 <_d M_2 : Type$, then there is
a derivation $D'$ of $\Gamma' \vdash M_1 <_d M_2 : Type$ such that
$Depth(D) = Depth(D')$, or

2. if $D$ is a derivation of $\Gamma \vdash M_1 \prec_d M_2 : Type$, then there
is a derivation $D'$ of $\Gamma' \vdash M_1 \prec_d M_2 : Type$ such that
$Depth(D) = Depth(D')$.

Proof. By induction on derivations.

Lemma 7. If $\Gamma, x : K, \Gamma' \vdash M_1 <_{c_1} M_2 : Type \in \mathcal{C}$
and $\Gamma \vdash c_2 : (K')K$ then
$\Gamma, y : K', [c_2(y)/x]\Gamma' \vdash [c_2(y)/x]M_1 <_{[c_2(y)/x]c_1} [c_2(y)/x]M_2 : Type \in \mathcal{C}$.

Proof. By the weakening and substitution conditions in the definition of WDC.



Lemma 8. If $\Gamma \vdash c_2 : (K')K$ and

1. if $D$ is a derivation of
$\Gamma, x : K, \Gamma' \vdash M_1 <_{c_1} M_2 : Type$, then there is a
derivation $D'$ of
$\Gamma, y : K', [c_2(y)/x]\Gamma' \vdash [c_2(y)/x]M_1 <_{[c_2(y)/x]c_1} [c_2(y)/x]M_2 : Type$
such that $Depth(D) = Depth(D')$, or

2. if $D$ is a derivation of
$\Gamma, x : K, \Gamma' \vdash M_1 \prec_{c_1} M_2 : Type$, then there is a
derivation $D'$ of
$\Gamma, y : K', [c_2(y)/x]\Gamma' \vdash [c_2(y)/x]M_1 \prec_{[c_2(y)/x]c_1} [c_2(y)/x]M_2 : Type$
such that $Depth(D) = Depth(D')$.

Proof. By induction on derivations and Lemma 7. The theorems of weakening
and substitution in type theory $T$ and the conservativity of
$T[\mathcal{R}\pi_1]_0$ over $T$ are also needed in this proof.

Now, we can prove the admissibility of transitivity rules. 

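Theorem 5. (Transitivity in $T[\mathcal{R}\pi_1]_0$) If $\Gamma \vdash M_2 = M_2' : Type$, then

1. if $\Gamma \vdash M_1 <_{d_1} M_2 : Type$ and $\Gamma \vdash M_2' <_{d_2} M_3 : Type$, then
$\Gamma \vdash M_1 <_{d_2 \circ d_1} M_3 : Type$, and

2. if $\Gamma \vdash M_1 \prec_{d_1} M_2 : Type$ and $\Gamma \vdash M_2' \prec_{d_2} M_3 : Type$, then
$\Gamma \vdash M_1 \prec_{d_2 \circ d_1} M_3 : Type$, and

3. if $\Gamma \vdash M_1 <_{d_1} M_2 : Type$ and $\Gamma \vdash M_2' \prec_{d_2} M_3 : Type$, then
$\Gamma \vdash M_1 \prec_{d_2 \circ d_1} M_3 : Type$, and

4. if $\Gamma \vdash M_1 \prec_{d_1} M_2 : Type$ and $\Gamma \vdash M_2' <_{d_2} M_3 : Type$, then
$\Gamma \vdash M_1 \prec_{d_2 \circ d_1} M_3 : Type$.

Proof. By induction on $Depth(D) + Depth(D')$, where $D$ is a derivation of
$\Gamma \vdash M_1 <_{d_1} M_2 : Type$ or $\Gamma \vdash M_1 \prec_{d_1} M_2 : Type$, and $D'$ is a derivation of
$\Gamma \vdash M_2' <_{d_2} M_3 : Type$ or $\Gamma \vdash M_2' \prec_{d_2} M_3 : Type$.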



5 Discussions 



Side conditions.³ In order to block the unwanted derivations, one may still
try to keep the $\pi_1$ rule in Section 3 and use side conditions for the First
Component rule, without introducing any new subtyping relation. For instance,
one such side condition for the First Component rule is the following.

$$\frac{\Gamma \vdash A <_c A' : Type \qquad \Gamma \vdash B : (A')Type}{\Gamma \vdash \Sigma(A, B \circ c) <_{d_1} \Sigma(A', B) : Type} \quad (size(A) = size(A'))$$



3 Thanks to an anonymous referee for the comments on this issue.

or

$$\frac{\Gamma \vdash A <_c A' : Type \qquad \Gamma \vdash B : (A')Type}{\Gamma \vdash \Sigma(A, B \circ c) <_{d_1} \Sigma(A', B) : Type}$$



(size(A) size(A')) 



In $T[\mathcal{R}\pi_1]_0$, size is well-defined. Similarly, size can be defined in $T[\mathcal{R}]_0$ and one
can prove its well-definedness (see [17,13] for more details of $T[\mathcal{R}]_0$ and $T[\mathcal{R}]$;
here, $\mathcal{R}$ includes one of the above rules). It is obvious that $T[\mathcal{R}\pi_1]_0$ and $T[\mathcal{R}]_0$
are equivalent in terms of the following lemma.

Lemma 9. If $\Gamma \vdash A \propto_c B : Type$ is derivable in $T[\mathcal{R}\pi_1]_0$ then $\Gamma \vdash A <_c B : Type$
is derivable in $T[\mathcal{R}]_0$ and vice versa.

However, since the system $T[\mathcal{R}]$ includes the Coercive definition rule and the
Coercive application rules in the Appendix, $A$ and $A'$ in the side condition may not
be well-typed in the original type theory any more. The way to compute such
terms is to insert coercions first and then do the usual computation in the original
type theory. So the property that coercion insertion is decidable in $T[\mathcal{R}]$ must be
proved first in order to argue the well-definedness of size. There is a circularity:
a property of $T[\mathcal{R}]$ is needed in order to present $T[\mathcal{R}]$ itself.

Algorithm and decidability. Since we have proved coherence and the admissibility
of substitution and transitivity, coercion search for the whole system is
decidable if it is decidable for $\mathcal{C}$. In other words, there is an algorithm to check
whether there exists a coercion between any two types. We omit the details here.
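
The details of the algorithm are not given here. Purely as an illustration of the role played by transitivity, the following sketch searches a finite table of basic coercions and composes the ones it finds. The table `BASE`, the type names and the breadth-first strategy are all assumptions made for this example; the sketch ignores the structural rules and does not check coherence.

```python
from collections import deque


def compose_all(path):
    """Compose a list of coercions, applying the first element first."""
    def composed(x):
        for c in path:
            x = c(x)
        return x
    return composed


def find_coercion(base, src, dst):
    """Return a composite coercion from src to dst, or None if there is none.

    `base` maps pairs (A, B) to a function playing the role of a basic coercion.
    """
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        node, path = queue.popleft()
        for (a, b), c in base.items():
            if a != node or b in seen:
                continue
            new_path = path + [c]
            if b == dst:
                return compose_all(new_path)
            seen.add(b)
            queue.append((b, new_path))
    return None


# Hypothetical basic coercions between three made-up types.
BASE = {("Nat", "Int"): int, ("Int", "Real"): float}
nat_to_real = find_coercion(BASE, "Nat", "Real")
```

In this toy table `nat_to_real` is the composition of the two basic coercions, while `find_coercion(BASE, "Real", "Nat")` returns `None`. No coercion is ever returned from a type to itself.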

Further study. In this paper, we have presented a case study of how to combine
incoherent coercions. The methods developed here may have a wider application.
In general, it is also natural to consider new subtyping relations to block those
derivations which make the coercive subtyping system incoherent. The method
of introducing new transitivity rules may guide a further study of systems in
which there is more than one subtyping relation.

The subtyping rules for parameterised inductive types need further study.
For example, we introduce subtyping rules for lists as follows.

$$\frac{\Gamma \vdash A \propto_c B : Type}{\Gamma \vdash List(A) <_d List(B) : Type}$$

where $d = map(A, B, c)$ such that $d(nil(A)) = nil(B)$ and $d(cons(A, a, l)) = cons(B, c(a), d(l))$.

As studied in [14], if we add this rule to the system, the transitivity rules
would not be admissible. In a forthcoming paper, we will study new computation
rules for parameterised inductive types; such rules will make, for example,
$map(B, C, c') \circ map(A, B, c)$ and $map(A, C, c' \circ c)$ computationally equal.
Hence the above subtyping rules for lists will enjoy the property of admissibility of
transitivity.
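
The two coercions mentioned above can be compared on concrete inputs with a small sketch. It uses Python lists in place of $List(A)$, and the element coercions `c` and `c_prime` are made-up examples. Note that agreement on inputs is only extensional equality; it does not show that the two terms are computationally equal in the type theory, which is exactly the point of the new computation rules.

```python
def map_coercion(c):
    """The lifted coercion d = map(A, B, c), following the defining equations
    d(nil) = nil and d(cons(a, l)) = cons(c(a), d(l))."""
    def d(l):
        if not l:
            return []
        head, *tail = l
        return [c(head)] + d(tail)
    return d


def compose(g, f):
    """Function composition g o f."""
    return lambda x: g(f(x))


# Hypothetical element coercions c : A -> B and c' : B -> C.
c = float
c_prime = str

lhs = compose(map_coercion(c_prime), map_coercion(c))   # map(B,C,c') o map(A,B,c)
rhs = map_coercion(compose(c_prime, c))                 # map(A,C,c' o c)
sample = [1, 2, 3]
```

Both `lhs(sample)` and `rhs(sample)` give the same list, as expected from the defining equations of `map`.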

Related work. The early development of the framework of coercive subtyping
is closely related to Aczel's idea in type-checking overloading methods for classes
[1] and the work on giving coercion semantics to lambda calculi with subtyping
by Breazu-Tannen et al. [6]. Barthe and his colleagues have studied constructor
subtyping and its possible applications in proof systems [4,5]. A recent logical
study of subtyping in system F can be found in [12] and Chen has studied the
issue of transitivity elimination in that framework [9].

Acknowledgements. We would like to thank the members of the Computer-Assisted
Reasoning Group at Durham for discussions and the TYPES03 referees
for the comments on the paper.

Appendix

The following are the inference rules for the coercive subkinding extension
$T[\mathcal{R}\pi_1]$ (not including the rules for subtyping).

Basic subkinding rule

$$\frac{\Gamma \vdash A <_c B : Type}{\Gamma \vdash El(A) <_c El(B)} \qquad \frac{\Gamma \vdash A \prec_c B : Type}{\Gamma \vdash El(A) <_c El(B)}$$

Coercive application rules

$$\frac{\Gamma \vdash f : (x : K)K' \qquad \Gamma \vdash k_0 : K_0 \qquad \Gamma \vdash K_0 <_c K}{\Gamma \vdash f(k_0) : [c(k_0)/x]K'}$$

$$\frac{\Gamma \vdash f = f' : (x : K)K' \qquad \Gamma \vdash k_0 = k_0' : K_0 \qquad \Gamma \vdash K_0 <_c K}{\Gamma \vdash f(k_0) = f'(k_0') : [c(k_0)/x]K'}$$

Coercive definition rule

$$\frac{\Gamma \vdash f : (x : K)K' \qquad \Gamma \vdash k_0 : K_0 \qquad \Gamma \vdash K_0 <_c K}{\Gamma \vdash f(k_0) = f(c(k_0)) : [c(k_0)/x]K'}$$

Subkinding for dependent product kinds

$$\frac{\Gamma \vdash K_1' = K_1 \qquad \Gamma, x' : K_1' \vdash K_2 <_c K_2' \qquad \Gamma, x : K_1 \vdash K_2\ kind}{\Gamma \vdash (x : K_1)K_2 <_{[f : (x : K_1)K_2][x' : K_1']c(f(x'))} (x' : K_1')K_2'}$$

$$\frac{\Gamma \vdash K_1' <_c K_1 \qquad \Gamma, x' : K_1' \vdash [c(x')/x]K_2 = K_2' \qquad \Gamma, x : K_1 \vdash K_2\ kind}{\Gamma \vdash (x : K_1)K_2 <_{[f : (x : K_1)K_2][x' : K_1']f(c(x'))} (x' : K_1')K_2'}$$

$$\frac{\Gamma \vdash K_1' <_{c_1} K_1 \qquad \Gamma, x' : K_1' \vdash [c_1(x')/x]K_2 <_{c_2} K_2' \qquad \Gamma, x : K_1 \vdash K_2\ kind}{\Gamma \vdash (x : K_1)K_2 <_{[f : (x : K_1)K_2][x' : K_1']c_2(f(c_1(x')))} (x' : K_1')K_2'}$$

Congruence rules for subkinding

$$\frac{\Gamma \vdash K_1 <_c K_2 \qquad \Gamma \vdash K_1 = K_1' \qquad \Gamma \vdash K_2 = K_2' \qquad \Gamma \vdash c = c' : (K_1)K_2}{\Gamma \vdash K_1' <_{c'} K_2'}$$

Transitivity and Substitution rules for subkinding

$$\frac{\Gamma \vdash K <_c K' \qquad \Gamma \vdash K' <_{c'} K''}{\Gamma \vdash K <_{c' \circ c} K''}$$

$$\frac{\Gamma, x : K, \Gamma' \vdash K_1 <_c K_2 \qquad \Gamma \vdash k : K}{\Gamma, [k/x]\Gamma' \vdash [k/x]K_1 <_{[k/x]c} [k/x]K_2}$$

Abstract. Proof search has been used to specify a wide range of computation
systems. In order to build a framework for reasoning about such specifications, 
we make use of a sequent calculus involving induction and co-induction. These 
proof principles are based on a proof theoretic notion of definition, following on 
work by Schroeder-Heister, Girard, and McDowell and Miller. Definitions are es- 
sentially stratified logic programs. The left and right rules for defined atoms treat 
the definitions as defining fixed points. The use of definitions also makes it possi- 
ble to reason intensionally about syntax, in particular enforcing free equality via 
unification. The full system thus allows inductive and co-inductive proofs involv- 
ing higher-order abstract syntax. We extend earlier work by allowing induction 
and co-induction on general definitions and show that cut-elimination holds for 
this extension. We present some examples involving lists and simulation in the
lazy λ-calculus. Two prototype implementations are available: one via the Hybrid
system implemented on top of Isabelle/HOL and the other in the BLinc system
implemented on top of λProlog.



1 Introduction 

A common approach to specifying computation systems is via deductive systems, e.g., 
structural operational semantics. Such specifications can be represented as logical theo- 
ries in a suitably expressive formal logic in which proof-search can then be used to model 
the computation. This use of logic as a specification language is along the line of logical 
frameworks [22]. The representation of the syntax of computation systems inside formal 
logic can benefit from the use of higher-order abstract syntax (HOAS), a high-level and 
declarative treatment of object-level bound variables and substitution. At the same time, 
we want to use such a logic in order to reason over the meta-theoretical properties of ob- 
ject languages, for example type preservation in operational semantics [15], soundness 
and completeness of compilation [19] or congruence of bisimulation in transition
systems [16]. Typically this involves reasoning by (structural) induction and, when dealing
with infinite behaviour, co-induction [5].

The need to support both inductive and co-inductive reasoning and some form of
HOAS requires some careful design decisions, since the two are prima facie notoriously
incompatible. While any meta-language based on a λ-calculus can be used to specify and
possibly perform computations over HOAS encodings, meta-reasoning has traditionally
involved (co)inductive specifications both at the level of the syntax and of the
judgements as well (which are of course unified at the type-theoretic level). The first provides
crucial freeness properties for datatype constructors, while the second offers principles
of case analysis and (co)induction. This is well-known to be problematic, since HOAS
specifications lead to non-monotone (co)inductive definitions, which for cardinality and
consistency reasons are not permitted in inductive logical frameworks. Moreover, even
when HOAS is weakened so as to be made compatible with standard proof assistants [7]
such as HOL or Coq, the latter tend to be still too strong, in the sense of allowing the
existence of too many functions and yielding the so-called exotic terms. This causes a loss
of adequacy in HOAS specifications, which is one of the pillars of formal verification.
On the other hand, logics such as LF [11] that are weak by design in order to support
this style of syntax are not directly endowed with (co)induction principles.

The contribution of this paper lies in the design of a new logic, called Linc (for a logic
with λ-terms, induction and co-induction), that carefully adds principles of induction
and co-induction to a higher-order intuitionistic logic based on a proof theoretic notion
of definition, following on work (among others) by Schroeder-Heister [27], Girard [10]
and McDowell and Miller [14]. Definitions are akin to logic programs, but allow us to
view theories as "closed" or defining fixed points. This alone allows us to perform case
analysis. Our approach to formalizing induction and co-induction is via the least and
greatest solutions of the fixed point equations specified by the definitions. Such least
and greatest solutions are guaranteed to exist by a stratification condition on definitions
(which basically ensures monotonicity). The proof rules for induction and co-induction
make use of the notion of pre-fixed points and post-fixed points respectively. In the
inductive case, this corresponds to the induction invariant, while in the co-inductive one
to the so-called simulation.

The simply typed language underlying Linc and the notion of definition make it
possible to reason intensionally about syntax, in particular enforcing free equality via
unification, which can be used on first-order terms or higher-order λ-terms. In fact,
we can support HOAS encodings of constructors without requiring them to belong to a
datatype. In particular we can prove the freeness properties of those constructors, namely
injectivity, distinctness and case exhaustion. Judgements are encoded as definitions
according to their informal semantics, either inductive, co-inductive or regular, i.e. true
in every fixed point. Given the stratification condition, we (currently) fall short of the
LF-like idea of Full HOAS, although, exploiting the equivalence with the completion of
a logic program [26], the monotonicity requirement can be weakened beyond the scope
of current induction-based proof-assistants.

Fig. 1. Inference rules for the core Linc (left and right rules for the logical connectives, together with init and mc, where n > 0)

Linc can be proved to be a conservative extension of $FO\lambda^{\Delta\mathbb{N}}$ [14] and a generalization
to the higher-order case of Martin-Löf's [13] first-order theory of iterated inductive
definitions. Moreover, to the best of our knowledge, it is the first sequent calculus with a
cut-elimination theorem for co-inductive definitions. Further, its modular design makes
its extension easy, for example in the direction of $FO\lambda^{\Delta\nabla}$ [18] or the regular world
assumption [28].

The rest of the paper is organized as follows. Section 2 introduces the proof system 
for the logic Linc. Section 3 shows some examples of using induction and co-induction to
prove several properties of list-related predicates and the lazy λ-calculus. Section 4 gives
an overview of the cut-elimination procedure, the detailed proof of which is available 
in [31]. Section 5 surveys the related work and Section 6 concludes this paper. 



2 The Logic Linc

The logic Linc shares the core fragment with $FO\lambda^{\Delta\mathbb{N}}$, which is an intuitionistic version
of Church's Simple Theory of Types. Formulae in the logic are built from predicate
symbols and the usual logical connectives $\bot$, $\top$, $\land$, $\lor$, $\supset$, $\forall_\tau$ and $\exists_\tau$. Following Church,
formulae will be given type $o$. The quantification type $\tau$ can have higher types, but
those are restricted to not contain $o$. Thus the logic has a first-order proof theory but
allows for the encoding of higher-order abstract syntax. The core fragment of the logic
is presented in the sequent calculus in Figure 1. A sequent is denoted by $\Gamma \longrightarrow C$ where
$C$ is a formula and $\Gamma$ is a multiset of formulae. Notice that in the presentation of the rule
schemes, we make use of HOAS, e.g., in the application $B\,x$ it is implicit that $B$ has no
free occurrence of $x$. In the $\forall\mathcal{R}$ and $\exists\mathcal{L}$ rules, $y$ is an eigenvariable that is not free in the
lower sequent of the rule. Whenever we write down a sequent, it is assumed implicitly
that the formulae are well-typed and in $\beta\eta$-long normal form; the type context, i.e.,
the types of the constants and the eigenvariables used in the sequent, is left implicit as
well. The $mc$ rule is a generalization of the cut rule that simplifies the presentation of
the cut-elimination proof.

2.1 A Proof-Theoretic Notion of Definitions

We extend the core logic in Figure 1 by allowing the introduction of non-logical constants. 
An atomic formula, i.e., a formula that contains no occurrences of logical constants, can
be defined in terms of other logical or non-logical constants. Its left and right rules are, 
roughly speaking, carried out by replacing the formula corresponding to its definition 







A. Momigliano and A. Tiu 



with the atom itself. A defined atom can thus be seen as a generalized connective, whose 
behaviour is determined by its defining clauses. The syntax of definition clauses used by 
McDowell and Miller [14] resembles that of logic programs, that is, a definition clause 
consists of a head and a body, with the usual pattern matching in the head; for example, 
the predicate nat for natural numbers is written $\{nat\ z \triangleq \top,\ nat\ (s\ x) \triangleq nat\ x\}$. We adopt
here a simpler presentation by putting all pattern matching in the body and combining 
multiple clauses with the same head in one clause with disjunctive body. Of course, this 
will require us to have explicit equality as part of our syntax. The corresponding nat 
predicate in our syntax will be written 

$$nat\ x \triangleq [x = z] \lor \exists y.[x = s\ y] \land nat\ y$$

and corresponds to the notion of iff-completion of a logic program. 

Definition 1. A definition clause is written $\forall\vec{x}[p\,\vec{x} \triangleq B\,\vec{x}]$, where $p$ is a predicate
constant. The atomic formula $p\,\vec{x}$ is called the head of the clause, and the formula $B\,\vec{x}$
is called the body. The symbol $\triangleq$ is used simply to indicate a definition clause: it is
not a logical connective. A definition is a (perhaps infinite) set of definition clauses. A
predicate may occur at most once in the heads of the clauses of a definition.



We will generally omit the outer quantifiers in a definition clause to simplify the presen- 
tation. 

Not all definition clauses are admitted in our logic, e.g., definitions with circular
calling through implications (negations) must be avoided, as they can lead to inconsistency [25].
The notion of level of a formula allows us to define a proper stratification on definitions.
To each predicate p we associate a natural number lvl(p), the level of p. The notion of 
level is extended to formulae and sequents. 

Definition 2. Given a formula $B$, its level $lvl(B)$ is defined as follows:

1. $lvl(p\,\vec{t}) = lvl(p)$, $lvl(\bot) = lvl(\top) = 0$

2. $lvl(B \land C) = lvl(B \lor C) = \max(lvl(B), lvl(C))$

3. $lvl(B \supset C) = \max(lvl(B) + 1, lvl(C))$

4. $lvl(\forall x.B\,x) = lvl(\exists x.B\,x) = lvl(B\,t)$, for any term $t$.

The level of a sequent $\Gamma \longrightarrow C$ is the level of $C$. A definition clause $\forall\vec{x}[p\,\vec{x} \triangleq B\,\vec{x}]$ is
stratified if $lvl(B\,\vec{x}) \leq lvl(p)$. A definition is stratified if all its definition clauses are
stratified.
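
The level function and the stratification check can be sketched as follows. The tuple encoding of formulae is an assumption of this sketch, as is the choice of giving equality atoms level 0 (the definition above does not mention equality). The stratification condition is read as $lvl(B\,\vec{x}) \leq lvl(p)$, which is what allows a predicate to occur in its own body.

```python
def lvl(formula, pred_level):
    """Level of a formula, given the level assigned to each predicate.

    Formulae are nested tuples (hypothetical encoding):
    ("atom", p), ("eq",), ("top",), ("bot",), ("and", B, C), ("or", B, C),
    ("imp", B, C), ("forall", B), ("exists", B).
    """
    tag = formula[0]
    if tag == "atom":
        return pred_level[formula[1]]
    if tag in ("top", "bot", "eq"):      # "eq" at level 0 is an assumption
        return 0
    if tag in ("and", "or"):
        return max(lvl(formula[1], pred_level), lvl(formula[2], pred_level))
    if tag == "imp":
        return max(lvl(formula[1], pred_level) + 1, lvl(formula[2], pred_level))
    if tag in ("forall", "exists"):
        return lvl(formula[1], pred_level)
    raise ValueError(f"unknown connective: {tag}")


def stratified(definition, pred_level):
    """A definition maps each predicate name to the body of its clause."""
    return all(lvl(body, pred_level) <= pred_level[p]
               for p, body in definition.items())


# nat x := [x = z] or (exists y. [x = s y] and nat y)
NAT_BODY = ("or", ("eq",), ("exists", ("and", ("eq",), ("atom", "nat"))))
# p := p implies bottom, a circular call through an implication
BAD_BODY = ("imp", ("atom", "p"), ("bot",))
```

With `nat` at level 0 the clause for `nat` is stratified, whereas the clause for `p` fails the check for any level assigned to `p`, since its body has level `lvl(p) + 1`.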

An occurrence of a formula A in a formula C is strictly positive if that particular 
occurrence of A is not to the left of any implication in C. The stratification of definitions 
above implies that for every definition clause all occurrences of the head in the body are 
strictly positive. 

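Given a definition clause $p\,\vec{x} \triangleq B\,\vec{x}$, the right and left rules for predicate $p$ are

$$\frac{B\,\vec{t}, \Gamma \longrightarrow C}{p\,\vec{t}, \Gamma \longrightarrow C}\ def\mathcal{L} \qquad \frac{\Gamma \longrightarrow B\,\vec{t}}{\Gamma \longrightarrow p\,\vec{t}}\ def\mathcal{R}$$

The rules for equality predicates make use of substitutions. We assume the usual
definition of capture-avoiding substitutions. We use $\theta$, $\rho$, $\delta$ and $\sigma$ to denote those and their
application is written in post-fix notation, e.g., $t\theta$. The left and right rules for equality
are as follows

$$\frac{\{\Gamma\rho \longrightarrow C\rho \mid s\rho =_{\beta\eta} t\rho,\ \rho \in CSU(s, t)\}}{s = t, \Gamma \longrightarrow C}\ eq\mathcal{L} \qquad \frac{}{\Gamma \longrightarrow t = t}\ eq\mathcal{R}$$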



The substitution $\rho$ in $eq\mathcal{L}$ is called a unifier of $s$ and $t$. The set $CSU(s, t)$ is a complete
set of unifiers, i.e., given any unifier $\theta_1$ of $s$ and $t$, there is a unifier $\theta_2 \in CSU(s, t)$ such
that $\theta_1 = \theta_2 \circ \gamma$, for some substitution $\gamma$. In the first-order case, a set containing just the
most general unifier is a complete set of unifiers. In general, however, the complete set
of unifiers may contain more than one unifier and therefore we specify a set of sequents
as the premise of the $eq\mathcal{L}$ rule, which is to say that each sequent in the set is a premise
of the rule. Note that in applying $eq\mathcal{L}$, eigenvariables can be instantiated as a result.
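
For the first-order case only, a complete set of unifiers can be computed with a standard unification procedure. The sketch below is a textbook algorithm with occurs check, not the higher-order unification used by the logic; the term encoding (variables as strings, applications and constants as tuples) is an assumption of the example.

```python
def walk(t, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """Does variable v occur in term t under the bindings in subst?"""
    t = walk(t, subst)
    if isinstance(t, str):
        return v == t
    return any(occurs(v, a, subst) for a in t[1:])


def unify(s, t):
    """Return a most general unifier as a dict in triangular form, or None."""
    subst = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if isinstance(a, str):
            if occurs(a, b, subst):
                return None
            subst[a] = b
        elif isinstance(b, str):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None                      # clash of constructors
    return subst


def csu(s, t):
    """First-order case: the complete set of unifiers is empty or a singleton."""
    mgu = unify(s, t)
    return [] if mgu is None else [mgu]


Z = ("z",)                                   # constant z
def S(t):                                    # constructor s applied to t
    return ("s", t)
```

When `csu` returns the empty list, the left rule for equality has no premises, so a sequent with the hypothesis `z = s Y` is proved immediately; this is how distinctness of constructors is obtained. Injectivity shows up in `csu(S("X"), S(Z))`, whose only unifier binds `X` to `z`.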



2.2 Induction and Co-induction 



A definition $p\,\vec{x} \triangleq B\,\vec{x}$ can be seen as a fixed point equation saying that for every term
$\vec{t}$, $p\,\vec{t}$ holds if and only if $B\,\vec{t}$ holds. Since our notion of definition requires strict positivity
of occurrences of $p$ in $B$, existence of fixed points is always guaranteed. Hence the
provability of $p\,\vec{t}$ means that $\vec{t}$ is in a solution of the corresponding fixed point equation,
although not necessarily in the least (or greatest) solution (see e.g., [10] for an example).
Therefore we add extra rules that reflect the least and the greatest solutions, respectively.
Since we are in a monotone setting, we can use the pre-fixed point and the post-fixed
point as an approach to the least and greatest fixed points. In the following we assume,
for simplicity of presentation, that predicates are not defined by mutual recursion. The
more general case where mutual recursion is treated can be found in [31].
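
The difference between the least and the greatest solution can be made concrete on a finite universe. In the sketch below the universe, the extra element `"w"` (which is its own predecessor) and the operator `body` are made up for illustration; `body` mimics the clause for `nat`, reading `x = s y` as "`y` is the predecessor of `x`".

```python
UNIVERSE = frozenset({0, 1, 2, "w"})
PRED = {1: 0, 2: 1, "w": "w"}        # "w" is its own predecessor


def body(S):
    """One unfolding of the clause: x is zero, or x = s y for some y in S."""
    return frozenset(x for x in UNIVERSE
                     if x == 0 or (x in PRED and PRED[x] in S))


def iterate(op, start):
    """Apply a monotone operator repeatedly until nothing changes."""
    current = frozenset(start)
    while True:
        nxt = op(current)
        if nxt == current:
            return current
        current = nxt


least = iterate(body, frozenset())   # built from below
greatest = iterate(body, UNIVERSE)   # cut down from above
```

Here `least` contains only 0, 1 and 2, while `greatest` also contains `"w"`. Both sets are solutions of the fixed point equation, which is why separate rules are needed to single out one of them.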

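Let $p\,\vec{x} \triangleq B\,\vec{x}$ be a definition clause and let $S$ be a term of the same type as $p$. The
induction rules for $p$ are

$$\frac{(B\,\vec{x})[S/p] \longrightarrow S\,\vec{x} \qquad \Gamma, S\,\vec{t} \longrightarrow C}{\Gamma, p\,\vec{t} \longrightarrow C}\ I\mathcal{L} \qquad \frac{\Gamma \longrightarrow B\,\vec{t}}{\Gamma \longrightarrow p\,\vec{t}}\ I\mathcal{R}$$

The abstraction $S$ is an invariant of the induction. The variables $\vec{x}$ are new eigenvariables.
An informal reading of $I\mathcal{L}$ is to consider $S$ as denoting a set (i.e., $\vec{t} \in S$ iff $S\,\vec{t}$ holds),
$B$ as denoting a fixed point operator and $S$ as a pre-fixed point of $B$, i.e., $B[S/p] \subseteq S$.
Notice that the right-rule for induction is $def\mathcal{R}$. The co-induction rules are defined dually.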



$$\frac{B\,\vec{t}, \Gamma \longrightarrow C}{p\,\vec{t}, \Gamma \longrightarrow C}\ CI\mathcal{L} \qquad \frac{\Gamma \longrightarrow S\,\vec{t} \qquad S\,\vec{x} \longrightarrow (B\,\vec{x})[S/p]}{\Gamma \longrightarrow p\,\vec{t}}\ CI\mathcal{R},\ \text{where } lvl(S) \leq lvl(p)$$

$S$ can be seen as denoting a post-fixed point, i.e., $S \subseteq B[S/p]$. The $CI\mathcal{L}$ rule is the $def\mathcal{L}$
rule. The proviso in $CI\mathcal{R}$, although mainly technical, is satisfied by every example we
have examined, since it requires the given predicate to be used "monotonically" in the
simulation.

To avoid inconsistency, some care must be taken in applying induction or
co-induction in a proof. One obvious pitfall is when the fixed point equation corresponding
to a definition clause has different least and greatest solutions. In such a case, mixing
induction and co-induction on the same definition clause can lead to inconsistency. For
example, let $p \triangleq p$ be a definition clause. Given the scheme of rules above without any
further restriction, we can construct the following derivation



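$$\dfrac{\dfrac{\dfrac{}{\longrightarrow \top}\ \top\mathcal{R} \qquad \dfrac{}{\top \longrightarrow \top}\ \top\mathcal{R}}{\longrightarrow p}\ CI\mathcal{R} \qquad \dfrac{\dfrac{}{\bot \longrightarrow \bot}\ \bot\mathcal{L} \qquad \dfrac{}{\bot \longrightarrow \bot}\ \bot\mathcal{L}}{p \longrightarrow \bot}\ I\mathcal{L}}{\longrightarrow \bot}\ cut$$

In the above derivation we use $\top$ as the simulation and $\bot$ as the invariant in the instances
of the $CI\mathcal{R}$ and $I\mathcal{L}$ rules, respectively. This example suggests that we have to use a definition clause
consistently throughout the proof, either inductively or co-inductively, but not both. To
avoid this problem, we introduce markings into a definition, whose role is to indicate
which rules are applicable to the corresponding defined atoms.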
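
The set-theoretic reading of this example can be checked mechanically. In the sketch below a propositional predicate is modelled as a subset of a one-point universe (an assumption of the encoding), and the body of the clause whose body is the predicate itself is the identity operator.

```python
from itertools import combinations

POINT = ()                                  # the single "argument tuple" of a proposition
UNIVERSE = frozenset({POINT})


def body(S):
    """Body of the clause with S substituted for p: the identity operator."""
    return frozenset(S)


def is_pre_fixed(S):
    return body(S) <= S                     # B[S/p] is included in S


def is_post_fixed(S):
    return S <= body(S)                     # S is included in B[S/p]


def subsets(u):
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


BOTTOM = frozenset()                        # the invariant "false"
TOP = UNIVERSE                              # the simulation "true"

# Least solution: intersection of all pre-fixed points.
least = frozenset.intersection(*[s for s in subsets(UNIVERSE) if is_pre_fixed(s)])
# Greatest solution: union of all post-fixed points.
greatest = frozenset().union(*[s for s in subsets(UNIVERSE) if is_post_fixed(s)])
```

`BOTTOM` is a pre-fixed point, so the inductive reading makes the predicate false, while `TOP` is a post-fixed point, so the co-inductive reading makes it true. Since `least` and `greatest` differ, using both readings for the same clause is contradictory.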

Definition 3. An extended definition is a stratified definition $\mathcal{D}$ together with a label on each clause that
indicates whether the clause is inductive, co-inductive, or regular. An inductive
clause is written as $p\,\vec{x} \stackrel{\mu}{=} B\,\vec{x}$, a co-inductive clause is written as $p\,\vec{x} \stackrel{\nu}{=} B\,\vec{x}$ and a
regular clause is written as $p\,\vec{x} \triangleq B\,\vec{x}$.

Since we shall only be concerned with extended definitions from now on, we shall
refer to those simply as definitions. The induction and co-induction rules need additional
provisos. The $I\mathcal{L}$ and $I\mathcal{R}$ rules can be applied only to an inductively defined atom.
Dually, the $CI\mathcal{L}$ and $CI\mathcal{R}$ rules can only be applied to a co-inductively defined atom.
The $def\mathcal{L}$ and $def\mathcal{R}$ rules apply only to regular atoms. However, we can show that $def\mathcal{L}$
and $def\mathcal{R}$ are derived rules for (co-)inductively defined atoms.

Proposition 1. The $def\mathcal{L}$ and $def\mathcal{R}$ rules are admissible in the core Linc system with
the induction and/or the co-induction rules.



Proof. We show here how to infer $def\mathcal{L}$ using core Linc and the induction rules. The other
case with co-induction is dual. Let $p\,\vec{x} \stackrel{\mu}{=} B\,\vec{x}$ be the definition under consideration:
$def\mathcal{L}$ can be inferred from $I\mathcal{L}$ using the body $B$ as the invariant.

$$\dfrac{\begin{matrix}\Pi\\ B[B/p]\,\vec{x} \longrightarrow B\,\vec{x}\end{matrix} \qquad B\,\vec{t}, \Gamma \longrightarrow C}{p\,\vec{t}, \Gamma \longrightarrow C}\ I\mathcal{L}$$

We construct the derivation $\Pi$ by induction on the size of $B$, i.e., the number of logical
constants in $B$. In the inductive cases, the derivation is constructed by applying the
rules for the logical connectives in $B$, coordinated between left and right rules. Since $p$
occurs strictly positively in $B$ by stratification, the only non-trivial base case we need
to consider is when we reach the sub-formula $p\,\vec{t}$ of $B\,\vec{x}$, in which case we just apply the
$I\mathcal{R}$ rule

$$\dfrac{\dfrac{}{B\,\vec{t} \longrightarrow B\,\vec{t}}\ init}{B\,\vec{t} \longrightarrow p\,\vec{t}}\ I\mathcal{R}$$

3 Examples

We now give some examples, starting with some that make essential use of HOAS. 



3.1 Lazy λ-Calculus

We consider an untyped version of the pure λ-calculus with lazy evaluation, following the
usual HOAS style, i.e., object-level λ-operator and application are encoded as constants
$lam : (tm \to tm) \to tm$ and $@ : tm \to tm \to tm$, where $tm$ is the syntactic category
of object-level λ-terms. The evaluation relation is encoded as the following inductive
definition

$$M \Downarrow N \stackrel{\mu}{=} (\exists M'.[M = lam\ M'] \land [M = N]) \lor (\exists M_1 \exists M_2 \exists P.[M = M_1\,@\,M_2] \land M_1 \Downarrow lam\ P \land (P\,M_2) \Downarrow N)$$

Notice that object-level substitution is realized via β-reduction in the meta-logic.
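
The same idea can be mimicked in a programming language, with host-language functions playing the role of meta-level abstractions. The sketch below is only an analogy: the classes `Lam` and `App`, the `fuel` bound and the sample terms are assumptions of the example, and running out of fuel does not prove that a term has no value.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Lam:
    body: Callable[["Term"], "Term"]   # HOAS: the binder is a host-language function


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


Term = Union[Lam, App]


def evaluate(m: Term, fuel: int = 200) -> Optional[Lam]:
    """Lazy evaluation: returns the value of m, or None if fuel runs out."""
    if isinstance(m, Lam):             # first clause: abstractions are values
        return m
    if fuel <= 0:
        return None
    f = evaluate(m.fun, fuel - 1)      # second clause: evaluate the head to lam P
    if f is None:
        return None
    # Object-level substitution is just application of the host function P;
    # the argument is passed unevaluated.
    return evaluate(f.body(m.arg), fuel - 1)


IDENT = Lam(lambda x: x)
KONST = Lam(lambda x: Lam(lambda y: x))
DELTA = Lam(lambda x: App(x, x))
OMEGA = App(DELTA, DELTA)
```

`evaluate(App(App(KONST, IDENT), OMEGA))` returns `IDENT` even though `OMEGA` has no value, because arguments are never evaluated before substitution. `evaluate(OMEGA, fuel=50)` returns `None`.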

The notion of applicative simulation of λ-expressions can be encoded as the (stratified)
co-inductive definition

$$sim\ R\ S \stackrel{\nu}{=} \forall T.\,R \Downarrow lam\ T \supset \exists U.\,S \Downarrow lam\ U \land \forall P.\,sim\ (T\,P)\ (U\,P).$$



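Given this encoding, we can prove the reflexivity property of simulation, i.e., $\forall s.\,sim\ s\ s$.
This is proved co-inductively by using the simulation $\lambda x \lambda y.\,x = y$. After applying $\forall\mathcal{R}$
and $CI\mathcal{R}$, it remains to prove the sequents $\longrightarrow s = s$, and

$$x = y \longrightarrow \forall x_1.\,x \Downarrow lam\ x_1 \supset (\exists x_2.\,y \Downarrow lam\ x_2 \land \forall x_3.(x_1\,x_3) = (x_2\,x_3))$$

The first sequent is provable by an application of the $eq\mathcal{R}$ rule. The second sequent is proved
as follows.

$$
\dfrac{
 \dfrac{
  \dfrac{
   \dfrac{
    \dfrac{
     \dfrac{}{z \Downarrow lam\ x_1 \longrightarrow z \Downarrow lam\ x_1}\ init
     \qquad
     \dfrac{
      \dfrac{}{z \Downarrow lam\ x_1 \longrightarrow (x_1\,x_3) = (x_1\,x_3)}\ eq\mathcal{R}
     }{z \Downarrow lam\ x_1 \longrightarrow \forall x_3.(x_1\,x_3) = (x_1\,x_3)}\ \forall\mathcal{R}
    }{z \Downarrow lam\ x_1 \longrightarrow (z \Downarrow lam\ x_1 \land \forall x_3.(x_1\,x_3) = (x_1\,x_3))}\ \land\mathcal{R}
   }{z \Downarrow lam\ x_1 \longrightarrow (\exists x_2.\,z \Downarrow lam\ x_2 \land \forall x_3.(x_1\,x_3) = (x_2\,x_3))}\ \exists\mathcal{R}
  }{x = y,\ x \Downarrow lam\ x_1 \longrightarrow (\exists x_2.\,y \Downarrow lam\ x_2 \land \forall x_3.(x_1\,x_3) = (x_2\,x_3))}\ eq\mathcal{L}
 }{x = y \longrightarrow x \Downarrow lam\ x_1 \supset (\exists x_2.\,y \Downarrow lam\ x_2 \land \forall x_3.(x_1\,x_3) = (x_2\,x_3))}\ \supset\mathcal{R}
}{x = y \longrightarrow \forall x_1.\,x \Downarrow lam\ x_1 \supset (\exists x_2.\,y \Downarrow lam\ x_2 \land \forall x_3.(x_1\,x_3) = (x_2\,x_3))}\ \forall\mathcal{R}
$$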

The transitivity property is expressed as $\forall r \forall s \forall t.\,sim\ r\ s \land sim\ s\ t \supset sim\ r\ t$. Its
proof involves co-induction on $sim\ r\ t$ with the simulation $\lambda u \lambda v.\,\exists w.\,sim\ u\ w \land sim\ w\ v$,
followed by case analyses (i.e., $def\mathcal{L}$ and $eq\mathcal{L}$ rules) on $sim\ r\ s$ and $sim\ s\ t$. The rest of
the proof is basically a series of manipulations of logical connectives.

We can also show the existence of divergent terms. Divergence is encoded as follows.

$$divrg\ T \stackrel{\nu}{=} (\exists T_1 \exists T_2.\,T = (T_1\,@\,T_2) \land divrg\ T_1) \lor (\exists T_1 \exists T_2.\,T = (T_1\,@\,T_2) \land \exists E.\,T_1 \Downarrow lam\ E \land divrg\ (E\,T_2)).$$

Let $\Omega$ be the term $(lam\ x.(x\,@\,x))\ @\ (lam\ x.(x\,@\,x))$. We show that $divrg\ \Omega$ holds. The
proof is straightforward by co-induction using the simulation $S := \lambda s.\,s = \Omega$. Applying
the $CI\mathcal{R}$ rule produces the sequents $\longrightarrow \Omega = \Omega$ and $T = \Omega \longrightarrow S_1 \lor S_2$ where

$S_1 := \exists T_1 \exists T_2.\,T = (T_1\,@\,T_2) \land (S\,T_1)$, and

$S_2 := \exists T_1 \exists T_2.\,T = (T_1\,@\,T_2) \land \exists E.\,T_1 \Downarrow lam\ E \land S\,(E\,T_2)$.

Clearly, only the second disjunct is provable, i.e., by instantiating $T_1$ and $T_2$ with the
same term $lam\ x.(x\,@\,x)$, and $E$ with the function $\lambda x.(x\,@\,x)$.

3.2 Lists 

Lists over some fixed type $\alpha$ are encoded as the type $lst$, with the usual constructor
$nil : lst$ for the empty list and $::$ of type $\alpha \to lst \to lst$. We consider here the append
predicate for both the finite and infinite case.

Finite lists. The usual append predicate on finite lists can be encoded as the inductive
definition

$$app\ L_1\ L_2\ L_3 \stackrel{\mu}{=} (L_1 = nil \land L_2 = L_3) \lor \exists x \exists L_1' \exists L_3'.\,L_1 = x :: L_1' \land L_3 = x :: L_3' \land app\ L_1'\ L_2\ L_3'.$$

Associativity of append is stated formally as

$$\forall l_1 \forall l_2 \forall l_{12} \forall l_3 \forall l_4.(app\ l_1\ l_2\ l_{12} \land app\ l_{12}\ l_3\ l_4) \supset \forall l_{23}.\,app\ l_2\ l_3\ l_{23} \supset app\ l_1\ l_{23}\ l_4.$$

Proving this formula requires us to prove first that the definition of append is functional,
that is,

$$\forall l_1 \forall l_2 \forall l_3 \forall l_4.\,app\ l_1\ l_2\ l_3 \land app\ l_1\ l_2\ l_4 \supset l_3 = l_4.$$

This is done by induction on $l_1$, i.e., we apply the $I\mathcal{L}$ rule on $app\ l_1\ l_2\ l_3$, after the
introduction rules for $\forall$ and $\supset$, of course. The invariant in this case is

$$S := \lambda r_1 \lambda r_2 \lambda r_3.\,\forall r.\,app\ r_1\ r_2\ r \supset r = r_3.$$

It is a simple case analysis to check that this is the right invariant. Back to our original
problem: after applying the introduction rules for the logical connectives in the formula,
the problem of associativity is reduced to the following sequent

$$app\ l_1\ l_2\ l_{12},\ app\ l_{12}\ l_3\ l_4,\ app\ l_2\ l_3\ l_{23} \longrightarrow app\ l_1\ l_{23}\ l_4. \qquad (1)$$

We then proceed by induction on the list $l_1$, that is, we apply the $I\mathcal{L}$ rule to the hypothesis
$app\ l_1\ l_2\ l_{12}$. The invariant is simply

$$S := \lambda l_1 \lambda l_2 \lambda l_{12}.\,\forall l_3 \forall l_4.\,app\ l_{12}\ l_3\ l_4 \supset \forall l_{23}.\,app\ l_2\ l_3\ l_{23} \supset app\ l_1\ l_{23}\ l_4.$$

Applying the $I\mathcal{L}$ rule, followed by $\lor\mathcal{L}$, to sequent (1) reduces the sequent to the following
sub-goals

(i) $S\ l_1\ l_2\ l_{12},\ app\ l_{12}\ l_3\ l_4,\ app\ l_2\ l_3\ l_{23} \longrightarrow app\ l_1\ l_{23}\ l_4$,

(ii) $(l_1 = nil \land l_2 = l_3) \longrightarrow S\ l_1\ l_2\ l_3$,

(iii) $\exists x \exists l_1' \exists l_3'.\,l_1 = x :: l_1' \land l_3 = x :: l_3' \land S\ l_1'\ l_2\ l_3' \longrightarrow S\ l_1\ l_2\ l_3$.

The proof of the first sequent is straightforward, since it only requires unfolding the invariant. The second sequent reduces to

$$app\ l_{12}\ l_3\ l_4,\ app\ l_{12}\ l_3\ l_{23} \longrightarrow app\ nil\ l_{23}\ l_4.$$

This follows from the functionality of append and $I\mathcal{R}$. The third sequent is basically
done by a series of case analyses. Of course, these proofs could have been simplified
by using a derived principle of structural induction. While this is easy to do, we have
preferred here to use the primitive $I\mathcal{L}$ rule.
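
As a sanity check that is independent of the proofs above, the inductive definition of append can be computed as a least fixed point over a small finite universe, and functionality and associativity can then be tested by brute force. The element set, the length bound and the tuple encoding of lists are assumptions of this sketch; such a check is of course no substitute for the inductive proof.

```python
from itertools import product

ELEMS = (0, 1)
MAX_LEN = 3
LISTS = [tuple(p) for n in range(MAX_LEN + 1) for p in product(ELEMS, repeat=n)]


def step(rel):
    """One unfolding of the body of the append clause on a relation of triples."""
    new = {((), l, l) for l in LISTS}                       # first disjunct
    new |= {((x,) + l1, l2, (x,) + l3)                      # second disjunct
            for (l1, l2, l3) in rel for x in ELEMS
            if len(l3) < MAX_LEN}                           # stay inside the universe
    return frozenset(new)


def least_fixed_point():
    rel = frozenset()
    while True:
        nxt = step(rel)
        if nxt == rel:
            return rel
        rel = nxt


APP = least_fixed_point()

functional = all(l3 == l4
                 for (a, b, l3) in APP for (c, d, l4) in APP
                 if (a, b) == (c, d))

associative = all((l1, l23, l4) in APP
                  for (l1, l2, l12) in APP
                  for (m12, l3, l4) in APP if m12 == l12
                  for (m2, m3, l23) in APP if (m2, m3) == (l2, l3))
```

On this universe the computed relation coincides with tuple concatenation, and both `functional` and `associative` evaluate to `True`.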

Infinite lists. The append predicate on infinite lists is defined via co-recursion, that is, we
define the behaviour of destructor operations on lists (i.e., taking the head and the tail
of the list). In this case we never construct explicitly the result of appending two lists;
rather, the head and the tail of the resulting lists are computed as needed. The co-recursive
append requires case analysis on all arguments.

$$
\begin{array}{rcl}
coapp\ L_1\ L_2\ L_3 & \stackrel{\nu}{=} & (L_1 = nil \land L_2 = nil \land L_3 = nil)\ \lor \\
 & & (\exists x \exists L_2' \exists L_3'.\,L_1 = nil \land L_2 = x :: L_2' \land L_3 = x :: L_3' \land coapp\ nil\ L_2'\ L_3')\ \lor \\
 & & (\exists x \exists L_1' \exists L_3'.\,L_1 = x :: L_1' \land L_3 = x :: L_3' \land coapp\ L_1'\ L_2\ L_3').
\end{array}
$$

The corresponding associativity property is stated analogously to the inductive one and
the main statement reduces to proving the sequent

$$coapp\ l_1\ l_2\ l_{12},\ coapp\ l_{12}\ l_3\ l_4,\ coapp\ l_2\ l_3\ l_{23} \longrightarrow coapp\ l_1\ l_{23}\ l_4.$$

We apply the $CI\mathcal{R}$ rule to $coapp\ l_1\ l_{23}\ l_4$, using the simulation

$$S := \lambda l_1 \lambda l_2 \lambda l_{12}.\,\exists l_{23} \exists l_3 \exists l_4.\,coapp\ l_{12}\ l_3\ l_4 \land coapp\ l_2\ l_3\ l_{23} \land coapp\ l_1\ l_{23}\ l_4.$$

Subsequent steps of the proof involve mainly case analysis on $coapp\ l_{12}\ l_3\ l_4$. As in the
inductive case, we have to prove the sub-cases when $l_{12}$ is $nil$. However, unlike in the
former case, case analyses on the arguments of $coapp$ suffice.



4 Cut-Elimination 



A central result of our work is cut-elimination, from which consistency of the logic 
follows. Gentzen’s classic proof of cut-elimination for first-order logic uses an induction 
on the size of the cut formula, i.e., the number of logical connectives in the formula. 
The cut-elimination procedure consists of a set of reduction rules that reduce a cut of a 
compound formula to cuts on its sub-formulae of smaller size. In the case of Linc, the
use of induction/co-induction complicates the reduction of cuts. Consider for example 
a cut involving the induction rules 



$$\dfrac{\dfrac{\Delta \longrightarrow B\,\vec{t}}{\Delta \longrightarrow p\,\vec{t}}\ I\mathcal{R} \qquad \dfrac{\begin{matrix}\Pi_B\\ B[S/p]\,\vec{y} \longrightarrow S\,\vec{y}\end{matrix} \qquad \begin{matrix}\Pi\\ S\,\vec{t}, \Gamma \longrightarrow C\end{matrix}}{p\,\vec{t}, \Gamma \longrightarrow C}\ I\mathcal{L}}{\Delta, \Gamma \longrightarrow C}$$

There are at least two problems in reducing this cut. First, any permutation upwards of
the cut will necessarily involve a cut with $S$ that can be of larger size than $p$, and hence a
simple induction on the size of the cut formula will not work. Second, the invariant $S$ does
not appear in the conclusion of the left premise of the cut. The latter means that we need
to transform the left premise so that its end sequent will agree with the right premise.
Any such transformation will most likely be global, and hence simple induction on the
height of derivations will not work either. We define a proof transformation that we call
unfolding to deal with the cut involving $I\mathcal{L}/I\mathcal{R}$ and $CI\mathcal{R}/CI\mathcal{L}$ pairs.

In the following definition, we refer to a premise of a rule as a minor premise if it is
the left premise of $\supset\mathcal{L}$ or $I\mathcal{L}$, or the right premise of $CI\mathcal{R}$, or $mc$; otherwise it is a major
premise. A derivation of a minor [major] premise is a minor [major] premise derivation.
To simplify the definitions of unfolding, we restrict the init rule to the atomic form.
The non-atomic init rule can easily be shown to be derivable using only structural rules,
logical rules and atomic init. We shall refer to this non-atomic init derivation as $Id$. We
use the notation $\Pi\theta$ to denote the application of the substitution $\theta$ to $\Pi$, which amounts
to applying the substitution to every sequent in $\Pi$.



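Definition 4. Inductive unfolding. Let $p\,x \stackrel{\mu}{=} B\,x$ be an inductive definition. Suppose
we are given a derivation $\Pi$ of $\Gamma \longrightarrow C$ where each occurrence of $p$ in $C$ is strictly
positive, and a derivation $\Pi_S$ of $B[S/p]\,x \longrightarrow S\,x$, for some closed term $S$. We define the
derivation $\mu(\Pi, \Pi_S)$ of $\Gamma \longrightarrow C[S/p]$ as follows. If $C[S/p] = C$, then $\mu(\Pi, \Pi_S) = \Pi$.
Otherwise, we define $\mu(\Pi, \Pi_S)$ based on the last rule in $\Pi$.

1. If $\Pi$ ends with init on the atom $p\,t$, then $\mu(\Pi, \Pi_S)$ is the derivation

$$
\dfrac{\begin{array}{c}\Pi_S\\ B[S/p]\,x \longrightarrow S\,x\end{array} \qquad \begin{array}{c}\mathrm{Id}\\ S\,t \longrightarrow S\,t\end{array}}{p\,t \longrightarrow S\,t}\;\mathrm{I}\mathcal{L}
$$

2. If $\Pi$ ends with $\supset\mathcal{R}$,

$$
\dfrac{\begin{array}{c}\Pi'\\ \Gamma, C_1 \longrightarrow C_2\end{array}}{\Gamma \longrightarrow C_1 \supset C_2}\;\supset\mathcal{R}
\qquad\text{then } \mu(\Pi, \Pi_S) \text{ is}\qquad
\dfrac{\begin{array}{c}\mu(\Pi', \Pi_S)\\ \Gamma, C_1 \longrightarrow C_2[S/p]\end{array}}{\Gamma \longrightarrow C_1 \supset C_2[S/p]}\;\supset\mathcal{R}
$$

Note that the restriction on the occurrences of $p$ in $C$ implies that $(C_1 \supset C_2)[S/p] =
C_1 \supset C_2[S/p]$.

3. If $\Pi$ ends with $\mathrm{I}\mathcal{R}$ on $p\,u$, for some terms $u$,

$$
\dfrac{\begin{array}{c}\Pi'\\ \Gamma \longrightarrow B\,u\end{array}}{\Gamma \longrightarrow p\,u}\;\mathrm{I}\mathcal{R}
\qquad\text{then } \mu(\Pi, \Pi_S) \text{ is}\qquad
\dfrac{\begin{array}{c}\mu(\Pi', \Pi_S)\\ \Gamma \longrightarrow B[S/p]\,u\end{array} \qquad \begin{array}{c}\Pi_S[u/x]\\ B[S/p]\,u \longrightarrow S\,u\end{array}}{\Gamma \longrightarrow S\,u}\;mc
$$

4. Otherwise, if $\Pi$ ends with any other rule, with the minor premise derivations $\Xi_1, \ldots,
\Xi_m$ for some $m \geq 0$ and the major premise derivations $\{\Pi_i\}_{i \in \mathcal{I}}$ for some index set
$\mathcal{I}$, then $\mu(\Pi, \Pi_S)$ ends with the same rule, with the same minor premises and the
major premises $\{\mu(\Pi_i, \Pi_S)\}_{i \in \mathcal{I}}$.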
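Definition 5. Co-inductive unfolding. Let $p\,x \stackrel{\nu}{=} B\,x$ be a co-inductive definition. Let
$C$ be a formula in which every occurrence of $p$ is strictly positive. Suppose we are given
a derivation $\Pi$ of $\Gamma \longrightarrow C[S/p]$ and a derivation $\Pi_S$ of $S\,x \longrightarrow B[S/p]\,x$, for some
closed term $S$. We define the derivation $\nu(\Pi, \Pi_S)$ of $\Gamma \longrightarrow C$ as follows. If $C[S/p] = C$,
then $\nu(\Pi, \Pi_S) = \Pi$. If $C = p\,t$ for some terms $t$, then $C[S/p] = S\,t$ and $\nu(\Pi, \Pi_S)$ is
the derivation

$$
\dfrac{\begin{array}{c}\Pi\\ \Gamma \longrightarrow S\,t\end{array} \qquad \begin{array}{c}\Pi_S\\ S\,x \longrightarrow B[S/p]\,x\end{array}}{\Gamma \longrightarrow p\,t}\;\mathrm{CI}\mathcal{R}
$$

Otherwise, we define $\nu(\Pi, \Pi_S)$ based on the last rule in $\Pi$. If $\Pi$ ends with $\supset\mathcal{R}$,

$$
\dfrac{\begin{array}{c}\Pi'\\ \Gamma, C_1 \longrightarrow C_2[S/p]\end{array}}{\Gamma \longrightarrow C_1 \supset C_2[S/p]}\;\supset\mathcal{R}
\qquad\text{then } \nu(\Pi, \Pi_S) \text{ is}\qquad
\dfrac{\begin{array}{c}\nu(\Pi', \Pi_S)\\ \Gamma, C_1 \longrightarrow C_2\end{array}}{\Gamma \longrightarrow C_1 \supset C_2}\;\supset\mathcal{R}
$$

If $\Pi$ ends with any other rule, with the minor premise derivations $\Xi_1, \ldots, \Xi_m$ for some
$m \geq 0$ and the major premise derivations $\{\Pi_i\}_{i \in \mathcal{I}}$ for some index set $\mathcal{I}$, then $\nu(\Pi, \Pi_S)$
also ends with the same rule, with the same minor premises and the major premises
$\{\nu(\Pi_i, \Pi_S)\}_{i \in \mathcal{I}}$.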

Our proof of cut-elimination uses the technique of reducibility originally due to Tait.
The method was applied by Martin-Löf [13] to the setting of natural deduction, and
to sequent calculus by McDowell and Miller for the logic $FO\lambda^{\Delta\mathbb{N}}$ [14]. The original
idea of Martin-Löf was to use derivations directly as a measure by defining a
well-founded ordering on them. The basis for the latter relation is a set of reduction rules
that are used to eliminate the applications of the cut rule. For the cases involving logical
connectives, the cut-reduction rules used to prove cut-elimination for Linc are the
same as those of $FO\lambda^{\Delta\mathbb{N}}$. The crucial cases involving (co)-induction are given in the
following definition. For simplicity of presentation, we assume the reduction involves
the leftmost and the rightmost premise derivations of $mc$.



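Definition 6. Cut-reduction. Let $\Xi$ be the derivation

$$
\dfrac{\begin{array}{c}\Pi_1\\ \Delta_1 \longrightarrow D_1\end{array} \quad\cdots\quad \begin{array}{c}\Pi_n\\ \Delta_n \longrightarrow D_n\end{array} \quad \begin{array}{c}\Pi\\ D_1, \ldots, D_n, \Gamma \longrightarrow C\end{array}}{\Delta_1, \ldots, \Delta_n, \Gamma \longrightarrow C}\;mc
$$

Case $\ast/\mathrm{I}\mathcal{L}$. If $D_1 = p\,t$, where $p\,x \stackrel{\mu}{=} B\,x$, and $\Pi$ ends with $\mathrm{I}\mathcal{L}$ on the cut formula $p\,t$,

$$
\dfrac{\begin{array}{c}\Pi_S\\ B[S/p]\,x \longrightarrow S\,x\end{array} \qquad \begin{array}{c}\Pi'\\ S\,t, D_2, \ldots, D_n, \Gamma \longrightarrow C\end{array}}{p\,t, D_2, \ldots, D_n, \Gamma \longrightarrow C}\;\mathrm{I}\mathcal{L}
$$

then $\Xi$ reduces to

$$
\dfrac{\begin{array}{c}\mu(\Pi_1, \Pi_S)\\ \Delta_1 \longrightarrow S\,t\end{array} \quad \begin{array}{c}\Pi_2\\ \Delta_2 \longrightarrow D_2\end{array} \quad\cdots\quad \begin{array}{c}\Pi_n\\ \Delta_n \longrightarrow D_n\end{array} \quad \begin{array}{c}\Pi'\\ S\,t, D_2, \ldots, D_n, \Gamma \longrightarrow C\end{array}}{\Delta_1, \ldots, \Delta_n, \Gamma \longrightarrow C}\;mc
$$

Case $\mathrm{CI}\mathcal{R}/\mathrm{CI}\mathcal{L}$. If $D_1 = p\,t$, where $p\,x \stackrel{\nu}{=} B\,x$, and $\Pi_1$ and $\Pi$ are

$$
\dfrac{\begin{array}{c}\Pi_1'\\ \Delta_1 \longrightarrow S\,t\end{array} \qquad \begin{array}{c}\Pi_S\\ S\,x \longrightarrow B[S/p]\,x\end{array}}{\Delta_1 \longrightarrow p\,t}\;\mathrm{CI}\mathcal{R}
\qquad\qquad
\dfrac{\begin{array}{c}\Pi'\\ B\,t, D_2, \ldots, D_n, \Gamma \longrightarrow C\end{array}}{p\,t, D_2, \ldots, D_n, \Gamma \longrightarrow C}\;\mathrm{CI}\mathcal{L}
$$

then $\Xi$ reduces to

$$
\dfrac{
  \dfrac{\begin{array}{c}\Pi_1'\\ \Delta_1 \longrightarrow S\,t\end{array} \qquad \begin{array}{c}\Xi_1\\ S\,t \longrightarrow B\,t\end{array}}{\Delta_1 \longrightarrow B\,t}\;mc
  \quad\cdots\quad
  \begin{array}{c}\Pi_n\\ \Delta_n \longrightarrow D_n\end{array}
  \quad
  \begin{array}{c}\Pi'\\ B\,t, \ldots, D_n, \Gamma \longrightarrow C\end{array}
}{\Delta_1, \ldots, \Delta_n, \Gamma \longrightarrow C}\;mc
$$

where $\Xi_1 = \nu(\Pi_S, \Pi_S)[t/x]$.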

Notice that these two reductions are not symmetric. This is because we use an asymmetric
measure to show the termination of cut-reduction, that is, the complexity of cut is always
reduced on the right premise. The difficulty in getting a symmetric measure, in the
presence of contraction and implication (in the body of a definition), was already observed
in [25].

To show the termination of cut-reduction, we define two orderings on derivations:
normalizability and reducibility (called computability in [13]). The well-foundedness
of the normalizability ordering immediately implies that the cut-elimination process
terminates. Reducibility is a superset of normalizability and hence its well-foundedness
implies the well-foundedness of normalizability. The main part of the proof lies in
showing that all derivations in Linc are reducible, and hence normalizable. This is stated
in Lemma 1, of which cut-elimination is a simple corollary.

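Lemma 1. For any derivation $\Pi$ of $B_1, \ldots, B_n, \Gamma \longrightarrow C$, reducible derivations
$\Pi_1, \ldots, \Pi_n$ of $\Delta_1 \longrightarrow B_1, \ldots, \Delta_n \longrightarrow B_n$ ($n \geq 0$), and substitutions $\delta_1, \ldots, \delta_n, \gamma$
such that $B_i\delta_i = B_i\gamma$ for every $i \in \{1, \ldots, n\}$, the following derivation $\Xi$ is reducible.

$$
\dfrac{\begin{array}{c}\Pi_1\delta_1\\ \Delta_1\delta_1 \longrightarrow B_1\delta_1\end{array} \quad\cdots\quad \begin{array}{c}\Pi_n\delta_n\\ \Delta_n\delta_n \longrightarrow B_n\delta_n\end{array} \quad \begin{array}{c}\Pi\gamma\\ B_1\gamma, \ldots, B_n\gamma, \Gamma\gamma \longrightarrow C\gamma\end{array}}{\Delta_1\delta_1, \ldots, \Delta_n\delta_n, \Gamma\gamma \longrightarrow C\gamma}\;mc
$$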

The proof proceeds by induction on the height of $\Pi$ with subordinate inductions on
$n$ and on the (well-founded) reduction tree of $\Pi_1, \ldots, \Pi_n$. We give a general idea of
the proof for the cases $\ast/\mathrm{I}\mathcal{L}$ and $\mathrm{CI}\mathcal{R}/\mathrm{CI}\mathcal{L}$ in Definition 6, and refer to [31] for full
details. In the following description, we refer to Definition 6 for the particular shapes
of the derivations $\Pi_1$ and $\Pi$. In the $\ast/\mathrm{I}\mathcal{L}$ case, it is sufficient to show that given the
reducibility of $\Pi_1$, the unfolding derivation $\mu(\Pi_1, \Pi_S)$ is still reducible. This is done
by induction on the construction of $\mu(\Pi_1, \Pi_S)$. The non-trivial case is when new cuts
($mc$) are introduced. But here we see that such an instance of $mc$ is always cutting with
$\Pi_S$, and hence by the outer induction hypothesis ($\Pi_S$ is of smaller height than $\Pi$) this
instance of $mc$ is reducible. The $\mathrm{CI}\mathcal{R}/\mathrm{CI}\mathcal{L}$ case is more complicated. In addition to
showing that the co-inductive unfolding preserves reducibility, we also need to show
that the unfolded derivation $\nu(\Pi_S, \Pi_S)$ is "closed" with respect to cut, that is, for every
reducible derivation $\Psi$ of $\Delta' \longrightarrow S\,u$, the derivation obtained by cutting $\Psi$
with $\nu(\Pi_S, \Pi_S)[u/x]$ is reducible. This case is dealt with by building this closure
condition into the notion of reducibility.



5 Related Work 



Linc has been designed as an intentionally weak logical framework [6] to be used as a
meta-language for reasoning over deductive systems encoded via HOAS. In particular,
it can be seen as the meta-theory of the simply typed λ-calculus, in the same sense
in which Schürmann's $\mathcal{M}_\omega$ [28] is the meta-theory of LF [11]. $\mathcal{M}_\omega$ is a constructive
first-order logic, whose quantifiers range over possibly open LF objects over a signature.
In the meta-logic it is possible to express and inductively prove meta-logical properties
of an object logic. By the adequacy of the encoding, the proof of the existence of the
appropriate LF object(s) guarantees the proof of the corresponding object-level property.
It must be remarked that $\mathcal{M}_\omega$ does not support co-induction yet. However, LF can be
used directly to specify an inductive meta-theorem as a relation between judgements,
with a logic programming interpretation providing the operational semantics.

Of course, there is a long association between mathematical logic and inductive
definitions [2] and in particular with proof theory, possibly the earliest relevant entry
being Martin-Löf's original formulation of the theory of iterated inductive
definitions [13]. From the impredicative encoding of inductive types [4] and the introduction
of (co)recursion [8] in system F, (co)inductive types became common [17] and made
it into type-theoretic proof assistants such as Coq [20], first via a primitive recursive
operator, but eventually in the let-rec style of functional programming languages, as in
Giménez's Calculus of Infinite Constructions [9]; here termination (resp. guardedness)
is ensured by a syntactic check (see also [1]). Note that this has severe limitations (e.g.,
in the possibility of using lemmas in the body of a guarded proof) that do not apply
to our approach. Circular proofs are also connected with the emerging proof theory of
fixed point logics and process calculi [24,29,30], in particular w.r.t. the relation between
systems with local and global induction, that is, between fixed point vs. well-founded
and guarded induction (i.e. circular proofs).

In higher order logic (co)inductive definitions are obtained via the usual Tarski fixed
point constructions, as realized for example in Isabelle/HOL [21]. As we mentioned
before, those approaches are at odds with HOAS even at the level of the syntax. Several
compromises have been proposed: the Theory of Contexts [12] (ToC) marries Weak
HOAS with an axiomatic approach encoding basic properties of names. Hybrid [3]
is a λ-calculus on top of Isabelle/HOL which provides the user with a Full HOAS syntax,
compatible with a classical (co)-inductive setting. Linc improves on the latter on several
counts. First, it disposes of Hybrid's notion of abstraction, which is used to carve out the
"parametric" function space from the full HOL space. Moreover, it is not restricted to
second-order abstract syntax, as the current Hybrid version is (and as ToC cannot escape
from being). Finally, at higher types, reasoning via $\mathrm{def}\mathcal{L}$ is more powerful than inversion:
for example $\forall y.\ \lambda x.y \neq \lambda x.x$ is provable in Linc, but fails both in Isabelle/HOL and
Coq, the latter for extensionality reasons.
6 Conclusion and Future Work

We have presented a proof theoretical treatment of both induction and co-induction in 
a sequent calculus compatible with HOAS encodings. The proof principle underlying 
the explicit proof rules is basically fixed point (co)induction. Our proof system is, as far 
as we know, the first which incorporates a co-induction proof rule and still preserves 
cut-elimination. We have shown several examples where informal (co)inductive proofs 
using invariants and simulations are reproduced formally in Linc. Consistency of the
logic is an easy consequence of cut-elimination. 

We currently have two prototype implementations of Linc. The one in the Hybrid
system [3,19] is better characterized as an approximation: definitional reflection is mimicked
by the elimination rules of (co)inductive definitions, which also provide (co)induction
principles, while the Hybrid λ-calculus takes care of the freeness properties.
Notwithstanding the limitations mentioned in Section 5, the implementation has the benefit of
inheriting all the automation of Isabelle/HOL, on top of which Hybrid is realized. The
second is a direct implementation of Linc rules in λProlog, with a Java graphical user
interface (available on the web at http://www.lix.polytechnique.fr/~tiu). This
prototype is currently limited to being a proof checker. A serious implementation would
require more study of the proof search properties of Linc. It is true that with induction
and co-induction there is no hope of automation in general. Nevertheless, a large subset
of the logic may still admit some uniformity in proof search.

On the theoretical level, we conjecture that the proviso in the $\mathrm{CI}\mathcal{R}$ rule can be
eliminated. Similarly, we can loosen the stratification condition, for example in the sense
of local stratification and of terminating higher-order logic programs [23], possibly
allowing us to encode proofs such as type preservation in operational semantics directly in
Linc rather than with the 2-level approach [15,19].

Another interesting problem to investigate is the connection with circular proofs,
which is particularly attractive from the viewpoint of proof search, both inductively and
co-inductively. This could be realized by directly proving a cut-elimination result for a
logic where circular proofs, under termination and guardedness conditions, completely
replace (co)inductive rules. Alternatively, we could reduce "global" proofs in such a
system to "local" proofs in Linc, similarly to [30]. Finally, extensions of Linc, for
example in the direction of $FO\lambda^{\nabla}$ [18] or the regular world assumption [28], are worth
investigating.
Abstract. In this paper we present the Coq formalisation of the QArith
library which is an implementation of rational numbers as binary
sequences for both lazy and strict computation. We use the representation
also known as the Stern-Brocot representation for rational numbers. This 
formalisation uses advanced machinery of the Coq theorem prover and 
applies recent developments in formalising general recursive functions. 
This formalisation highlights the role of type theory both as a tool to 
verify hand-written programs and as a tool to generate verified programs. 



1 Introduction 

The present work is the continuation of two earlier parallel works of the
authors [3,13] with two principal objectives:

1. To present a library of rational numbers for Coq [6] based on a canonical
representation for rational numbers also known as the Stern-Brocot
representation¹.

2. To verify in Coq the correctness of a family of lazy algorithms for exact 
rational arithmetic. 

In the present paper we do not detail the lazy algorithms that are described 
in [13]. For the complete formal development, we refer the reader to [14]. 

In Sect. 2 we present the set of rational numbers as an inductively defined set 
of signed binary sequences. In Sect. 3 we describe strict algorithms for the field 
operations. In Sect. 4 we describe lazy algorithms for these operations, based on 
homographic and quadratic transformations. In Sect. 5 we discuss the proof of
correctness for these algorithms. In Sect. 6 we discuss the question of program 
extraction as it is provided in the Coq type theory and the impact this question 
had on our formal work. Possible further work is mentioned in Sect. 7. 



1 A presentation of the Stern-Brocot trees and related publications is given in [13]. 
2 Rational Numbers as Binary Sequences

Given a positive fraction $\frac{m}{n}$, its Stern-Brocot representation is a finite binary
sequence consisting of the letters L and R, and is characterised by the following
encoding (left) and decoding (right) functions ([] denotes the empty sequence):





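$$
\left\lceil \frac{m}{n} \right\rceil :=
\begin{cases}
[\,] & \text{if } m = n,\\[2pt]
L \left\lceil \frac{m}{n-m} \right\rceil & \text{if } m < n,\\[2pt]
R \left\lceil \frac{m-n}{n} \right\rceil & \text{if } m > n;
\end{cases}
\qquad\qquad
\begin{aligned}
[\![\,[\,]\,]\!] &:= 1,\\
[\![\,L\,w\,]\!] &:= \frac{[\![w]\!]}{[\![w]\!] + 1},\\
[\![\,R\,w\,]\!] &:= [\![w]\!] + 1 .
\end{aligned}
$$
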



One can formalise this in Coq by first defining the set of positive rational numbers 
and then the entire set of rational numbers, inductively as: 

Inductive Q⁺ : Set := nR : Q⁺ → Q⁺ | dL : Q⁺ → Q⁺ | One : Q⁺.

Inductive Q : Set := Zero : Q | Qpos : Q⁺ → Q | Qneg : Q⁺ → Q.

We formalise the above encoding function that maps pairs of natural numbers 
p and q to the binary sequence representing $\frac{p}{q}$. The recursion in this function is
bounded by an extra measure argument. 

Fixpoint Q⁺_c (p q n : nat) {struct n} : Q⁺ :=
  match n with
  | O => One
  | S n' => match p - q with
            | O => match q - p with
                   | O => One
                   | v => dL (Q⁺_c p v n')
                   end
            | v => nR (Q⁺_c v q n')
            end
  end.



If either p or q is zero, the outcome of this function is irrelevant. The
encoding function for arbitrary rational numbers always calls Q⁺_c with positive
input.

Definition makeQ (m n : Z) :=
  match m, n with
  | Zpos _, Zpos _ => Qpos (Q⁺_c (abs m) (abs n) ((abs m) + (abs n)))
  | Zneg _, Zneg _ => Qpos (Q⁺_c (abs m) (abs n) ((abs m) + (abs n)))
  | Z0, _ => Zero
  | _, Z0 => Zero
  | _, _ => Qneg (Q⁺_c (abs m) (abs n) ((abs m) + (abs n)))
  end.



Here the function abs is the forgetful projection from Z onto nat (Coq natural
numbers). Thus for two integers p and q, makeQ p q produces the signed binary
sequence corresponding to $\frac{p}{q}$. For example:

Eval compute in makeQ 9 14. 

= Qpos (dL (nR (dL (nR (nR (nR One)))))) : Q 
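The encoding can also be traced outside Coq. The following Python sketch is an illustration only and not part of the QArith development: the name `encode_positive` and the use of strings over `L`/`R` in place of the constructors dL/nR are assumptions made here. It follows the same recursion by repeated subtraction, and it should reproduce the sequence computed above for 9/14.

```python
def encode_positive(p: int, q: int) -> str:
    """Stern-Brocot path of the positive fraction p/q as a string over {'L', 'R'}.

    'L' plays the role of dL, 'R' the role of nR, and the empty string
    the role of One.
    """
    if p <= 0 or q <= 0:
        raise ValueError("numerator and denominator must both be positive")
    bits = []
    while p != q:
        if p < q:
            bits.append("L")   # dL: continue with p / (q - p)
            q -= p
        else:
            bits.append("R")   # nR: continue with (p - q) / q
            p -= q
    return "".join(bits)       # p == q corresponds to One
```

Because the loop is the subtractive Euclidean algorithm, a non-reduced fraction such as 18/28 yields the same sequence as 9/14, which illustrates how the encoding integrates the greatest common divisor computation.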

Decoding functions have a similar structure, with a main decoding function 
for arbitrary rational numbers and a recursive function for positive rational 
numbers. Here the function Z_of_nat is the trivial injection of nat into Z.
Fixpoint Q⁺_i (w : Q⁺) : nat * nat :=
  match w with
  | One => (1, 1)
  | nR w' => match Q⁺_i w' with (p, q) => (p + q, q) end
  | dL w' => match Q⁺_i w' with (p, q) => (p, p + q) end
  end.

Definition decodeQ (q : Q) :=
  match q with
  | Qpos p => (Z_of_nat (fst (Q⁺_i p)), Z_of_nat (snd (Q⁺_i p)))
  | Qneg p => ((- Z_of_nat (fst (Q⁺_i p))), Z_of_nat (snd (Q⁺_i p)))
  | Zero => (0, 1)
  end.



We also proved that encoding and decoding are inverse operations. 



Lemma makeQ_decodeQ : ∀ q : Q, makeQ (fst (decodeQ q)) (snd (decodeQ q)) = q.



Note that this equality between the resulting signed binary sequence and the 
original sequence is syntactical (Leibniz equality). For the converse direction we 
can prove the following lemma: 



Lemma decodeQ_makeQ : ∀ m n : Z, let (p, q) := decodeQ (makeQ m n) in n ≠ 0 → m * q = n * p.
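The two lemmas can be spot-checked on small inputs with a Python sketch. It is an illustration under assumptions made here (positive fractions only, sequences written as strings over `L`/`R`, hypothetical names `encode` and `decode`) and is not a substitute for the Coq proofs.

```python
def encode(p: int, q: int) -> str:
    """Stern-Brocot sequence of the positive fraction p/q ('L' = dL, 'R' = nR)."""
    if p <= 0 or q <= 0:
        raise ValueError("numerator and denominator must both be positive")
    bits = []
    while p != q:
        if p < q:
            bits.append("L")
            q -= p
        else:
            bits.append("R")
            p -= q
    return "".join(bits)


def decode(bits: str):
    """Reduced fraction (p, q) represented by the sequence `bits`.

    The innermost constructor is processed first, as in the recursive
    decoding function: nR maps (p, q) to (p + q, q), dL to (p, p + q).
    """
    p, q = 1, 1                      # One
    for b in reversed(bits):
        if b == "R":
            p, q = p + q, q
        elif b == "L":
            p, q = p, p + q
        else:
            raise ValueError("sequence must contain only 'L' and 'R'")
    return p, q
```

Encoding after decoding returns the same sequence (a syntactic equality), whereas decoding after encoding returns the reduced fraction, which agrees with the input only up to cross-multiplication.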



Here the equality is not syntactical; rather, it is the definitional equality on
positive fractions. These lemmata show the advantage of our binary
representation for rational numbers. In a system like Coq, reasoning with data types
is considerably easier when we are dealing with the corresponding syntactical
equality; we can use the rewriting machinery of the theorem prover to ease the
equational reasoning. But the benefits of this canonical representation are not
restricted to the machinery of theorem provers. For a more detailed discussion
and examples of simplified mathematical proofs see [3].

The lemmata makeQ_decodeQ and decodeQ_makeQ demonstrate that the
inductively defined set Q is a representation for rational numbers. In the rest of
this paper we will show how we can equip the set Q with the usual algebraic
operations and prove the correctness of these operations.

3 Field Structure: A Strict Implementation 

In this section, we present the formalisation of algebraic operations on Q in the 
natural mathematical way. When computing an operation with rational numbers, 
mathematicians usually perform regular natural number computations with the 
numerators and denominators and then simplify the result to a reduced fraction, 
using a greatest common divisor computation. For this reason, we shall use the 
term fraction to denote the pair of a numerator and denominator. 

In our case, we start with values in the type Q and we use the function
decodeQ to obtain fractions for each operand and then perform the usual natural
number computations. At the end of the computation, the resulting fraction is
directly encoded using the function Q⁺_c, because this function already integrates
the greatest common divisor computation, as was shown in [3].

We only provide operations to manipulate positive fractions. The encapsulating
functions, which handle the conversions to and from the type Q, take care of sign
problems. This means that we need to provide three basic operations for
fractions: addition, multiplication, and subtraction. Both addition and subtraction
on fractions with positive components are needed for the addition on rational
numbers, because adding two numbers with opposite signs amounts to a
subtraction. Computing the sign of the result of an addition when the two numbers
have opposite signs also requires a function to compare two rational numbers. We
do not implement this comparison function at the level of fractions but rather at
the level of the type Q⁺. For multiplication, the situation is simpler: multiplying
rational numbers reduces to multiplying the absolute values and then computing
the sign of the result.

Comparing two numbers in the type Q⁺ is simple. The constructors nR and
dL can actually be interpreted as monotonically increasing functions; the former
always returns a result greater than 1 while the latter always returns a result
less than 1. Thus, it suffices to compare the two sequences bit by bit from left to right.

Fixpoint Q⁺_le_bool (w w' : Q⁺) {struct w'} : bool :=
  match w with
  | One => match w' with | dL y => false | _ => true end
  | dL y => match w' with | dL y' => Q⁺_le_bool y y' | _ => true end
  | nR y => match w' with | nR y' => Q⁺_le_bool y y' | _ => false end
  end.
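The bit-by-bit comparison can be mirrored in Python and cross-checked against exact fractions. This is only a sketch under assumptions made here: sequences are strings over `L`/`R` (for dL/nR), the empty string stands for One, and the names `le_bits` and `value` are hypothetical.

```python
from fractions import Fraction


def le_bits(w: str, v: str) -> bool:
    """True when the rational encoded by w is less than or equal to the one encoded by v."""
    i = 0
    while True:
        a = w[i] if i < len(w) else ""   # "" plays the role of One
        b = v[i] if i < len(v) else ""
        if a == "":
            return b != "L"              # One <= v unless v continues with dL
        if a == "L" and b != "L":
            return True                  # dL ... is below One and below nR ...
        if a == "R" and b != "R":
            return False                 # nR ... is above One and above dL ...
        i += 1                           # same leading bit: compare the tails


def value(w: str) -> Fraction:
    """Exact value of a sequence, used here only to cross-check le_bits."""
    x = Fraction(1)
    for bit in reversed(w):
        x = x + 1 if bit == "R" else x / (x + 1)
    return x
```

The loop returns as soon as the two sequences differ or the first one is exhausted, so it never inspects more than one bit past the common prefix; this is the sense in which comparison can answer without reading its whole input.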



This function is used to define a two-argument predicate Q⁺_le and a strongly
specified test function Q⁺_le_dec, which plays a key role in the implementation
of the operations, because the subtraction operation is only meaningful when
the first argument is greater than the second argument.

Definition Qplus (x y : Q) :=
  match x, y with
  | Qpos x', Qpos y' => Qpos (Q⁺_plus x' y')
  | Qpos x', Qneg y' =>
      match Q⁺_le_dec x' y' with
      | left h =>
          match Q⁺_eq_dec x' y' with
          | left h => Zero
          | right h => Qneg (Q⁺_sub y' x')
          end
      | right h => Qpos (Q⁺_sub x' y')
      end
  | Qneg x', Qneg y' => Qneg (Q⁺_plus x' y')



The unary operation of computing the opposite of a rational number is a
trivial matter. To compute the inverse, we need to compute the inverse of a
positive number in Q⁺. This is actually a very simple function, in which we do not
need to convert to a fraction of natural numbers and back.
Fixpoint Q⁺_inv (w : Q⁺) : Q⁺ :=
  match w with
  | One => One
  | nR w' => dL (Q⁺_inv w')
  | dL w' => nR (Q⁺_inv w')
  end.
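On sequences, inversion is just an exchange of the two letters. The Python sketch below illustrates this under assumptions made here (strings over `L`/`R` for dL/nR, hypothetical names `invert` and `decode`); the decoding helper is included only to check the result.

```python
def invert(w: str) -> str:
    """Swap the two constructors: every 'R' (nR) becomes 'L' (dL) and vice versa."""
    return w.translate(str.maketrans("LR", "RL"))


def decode(bits: str):
    """Reduced fraction (p, q) represented by a sequence over {'L', 'R'}."""
    p, q = 1, 1
    for b in reversed(bits):
        if b == "R":
            p, q = p + q, q
        else:
            p, q = p, p + q
    return p, q
```

For instance, inverting the sequence of 9/14 gives a sequence that decodes to 14/9, without ever computing a numerator or denominator during the inversion itself.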



With all these operations, it is then quite easy to show that the type
Q with comparison, addition, subtraction, multiplication, and inversion is an
Archimedean ordered field. We use this 'natural' implementation as a
reference implementation (but not a very efficient one) of the field operations (see
Sect. 5.1).

Most of these algorithms for the basic operations are strict, in the sense
that we need to process the entire bit strings of both arguments to obtain the
numerators and denominators before we start computing the result. Only the inverse
and comparison functions can start to return results without having processed
their entire input.

4 Field Structure: A Lazy Implementation 

4.1 Laziness 

Lazy computation is a constructive interpretation of continuity². The idea is that
if we are computing continuous functions on sequences, it is possible to do this
computation in a lazy manner, outputting partial information about the final
result after having processed only initial segments of the input. If we consider
streams (infinite sequences) instead of finite sequences, we can make this notion
more precise by calling a function on streams lazy if it is continuous with respect
to the Cantor space topology on the set of streams. In our case the operations
addition, multiplication, division and subtraction are all continuous both on $\mathbb{R}$
and on $\mathbb{Q}$ with the subspace topology, and we can devise lazy algorithms for
these operations. However, the inputs we consider are finite and this explains
why we could provide a strict implementation.

A lazy algorithm on sequences usually consists of two steps: (1) looking at
initial segments of the input, the absorption step; (2) outputting an initial
segment of the output, the emission step. An algorithm terminates when it emits
the empty sequence. When there are several inputs, the algorithm also contains
an absorption strategy to decide which input initial segment to absorb next.
Classical examples of lazy algorithms are those given by Gosper [8] for adding
and multiplying regular continued fractions. The work by Gosper was later
generalised for exact real arithmetic [18,12,17,9].

We devised lazy algorithms to compute directly on the Q⁺ and Q structures
without going through computations on fractions; we then showed their
equivalence with the strict algorithms from Sect. 3. One can show that the lazy
approach should have a lower computational complexity, especially when partial
answers are useful (for instance when dealing with fractions with
large denominators). However, we discovered in this work that the proof complexity has
an impact on the usability of the algorithms, and the strict approach is still the
most efficient one for some purposes (see Sect. 6.3).

² There are other aspects of laziness (e.g. laziness in the sense of sharing the reduction)
which we do not consider in this paper.

M. Niqui and Y. Bertot



4.2 Homographic and Quadratic Algorithms 



In this section we briefly discuss the homographic and quadratic algorithms for
computations on signed binary sequences. The algorithms that we use are
explained in detail in [13] and are available on-line as part of the Coq contribution
package QArith [14]. Following [8], to devise the basic field operations, we
consider a larger class of unary and binary operations. The basic operations are
then simultaneously obtained from the general algorithms.

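A homographic transformation of matrix $M$ is a function of the form

$$
h_M(x) = \frac{ax + b}{cx + d}, \qquad a, b, c, d \in \mathbb{Z}; \qquad M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.
$$

A quadratic transformation of matrix $T$ is a binary function of the form

$$
q_T(x, y) = \frac{axy + bx + cy + d}{exy + fx + gy + h}, \qquad a, b, c, d, e, f, g, h \in \mathbb{Z} \quad\text{and}\quad T = \begin{bmatrix} a & b & c & d \\ e & f & g & h \end{bmatrix}.
$$
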

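By taking the following special values for $T$ we obtain the algorithms for basic
arithmetic operations:

$$
T_\oplus = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
T_\otimes = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
T_\ominus = \begin{bmatrix} 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
T_\oslash = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.
$$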
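That these four coefficient matrices give the field operations can be checked by evaluating the quadratic transformation directly. The Python sketch below does so with exact fractions; it evaluates the formula strictly and is not the lazy algorithm, and the names `quadratic`, `T_ADD`, `T_MUL`, `T_SUB` and `T_DIV` are assumptions introduced for this illustration.

```python
from fractions import Fraction


def quadratic(T, x, y):
    """Evaluate (a*x*y + b*x + c*y + d) / (e*x*y + f*x + g*y + h) for T = ((a,b,c,d),(e,f,g,h))."""
    (a, b, c, d), (e, f, g, h) = T
    numerator = a * x * y + b * x + c * y + d
    denominator = e * x * y + f * x + g * y + h
    if denominator == 0:
        raise ZeroDivisionError("the denominator of the transformation is zero")
    return Fraction(numerator) / Fraction(denominator)


# Coefficient matrices for addition, multiplication, subtraction and division.
T_ADD = ((0, 1, 1, 0), (0, 0, 0, 1))
T_MUL = ((1, 0, 0, 0), (0, 0, 0, 1))
T_SUB = ((0, 1, -1, 0), (0, 0, 0, 1))
T_DIV = ((0, 1, 0, 0), (0, 0, 1, 0))
```

Note that only the division matrix has a denominator depending on the inputs, which is why a zero divisor shows up as a vanishing denominator of the transformation.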



In [13] the homographic algorithm is presented using two auxiliary algorithms:
the sign algorithm and the output-bit algorithm. The sign algorithm is a function
$\mathcal{S} : M_{2\times 2}(\mathbb{Z}) \times \mathbb{Q}^+ \to \{0, +1, -1\} \times M_{2\times 2}(\mathbb{Z}) \times \mathbb{Q}^+$ (here $M_{2\times 2}(\mathbb{Z})$ denotes the type
of 2 × 2 matrices over $\mathbb{Z}$). This means that the outcome of the sign algorithm is a
triple consisting of the sign, a matrix of coefficients, and the unused part of the
input sequence. The output-bit algorithm is a function $\mathcal{B} : M_{2\times 2}(\mathbb{Z}) \times \mathbb{Q}^+ \to \mathbb{Q}^+$
which outputs an unsigned binary sequence. Finally the homographic algorithm
is a function $\mathcal{H} : M_{2\times 2}(\mathbb{Z}) \times \mathbb{Q} \to \mathbb{Q}$ which combines the two functions $\mathcal{S}$ and
$\mathcal{B}$. Both functions $\mathcal{S}$ and $\mathcal{B}$ are recursive. In the case of $\mathcal{S}$ we are dealing with
a simple structural recursion on the binary sequence. The recursion in $\mathcal{B}$ is
more complex. If the recursion were structural, then all recursive calls would be
absorption steps. In our case, some recursive calls are only emission steps, but
the total sum of the matrix coefficients decreases while remaining positive. This
is one of the main difficulties of the verification process; in Coq, formalising
non-structural yet terminating recursion is possible in various ways. But all of
the methods either require a priori knowledge of the algorithm complexity (for
example Balaa and Bertot's method [1]) or lead to very large proof terms
by changing the representation of the function's domain (for example Bove
and Capretta's method [5]). In Sect. 4.4 we explain how we used a variant of
Bove and Capretta's method in our formalisation to formalise the non-structural
recursion.

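Similarly in the case of the quadratic algorithm, the sign algorithm is a function
$\mathcal{S}_2 : T_{2\times 4}(\mathbb{Z}) \times \mathbb{Q}^+ \times \mathbb{Q}^+ \to \{0, +1, -1\} \times T_{2\times 4}(\mathbb{Z}) \times \mathbb{Q}^+ \times \mathbb{Q}^+$, in which $T_{2\times 4}(\mathbb{Z})$
is the type of 2 × 4 integer matrices. The output-bit algorithm is a function
$\mathcal{B}_2 : T_{2\times 4}(\mathbb{Z}) \times \mathbb{Q}^+ \times \mathbb{Q}^+ \to \mathbb{Q}^+$; consequently, the quadratic algorithm will be
formalised as a function $\mathcal{Q} : T_{2\times 4}(\mathbb{Z}) \times \mathbb{Q} \times \mathbb{Q} \to \mathbb{Q}$ which combines the two
functions $\mathcal{S}_2$ and $\mathcal{B}_2$. The function $\mathcal{S}_2$ is a structurally recursive function with respect
to the binary structure of both inputs while the function $\mathcal{B}_2$ is not.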

4.3 Lazy Proof Obligation 

In [13] we implicitly assumed that the denominators of all transformations
involved are nonzero. This imposes a restriction on the formalisation because it
makes the algorithms partial. A standard way to formalise partial functions in
type theory is to add a proof obligation to the function's arguments, using a
predicate to specify the function's domain. For the homographic algorithm $\mathcal{H}$,
the domain predicate has the form

$\lambda a, b, c, d : \mathbb{Z};\ q : \mathbb{Q}.\ \Phi_{\mathcal{H}}(c, d, q)$.

The predicate $\Phi_{\mathcal{H}} : \mathbb{Z}^2 \times \mathbb{Q} \to \mathrm{Prop}$ means $c \cdot q + d \neq 0$, but we want to avoid using
the strict operations to define the predicate. We first define a domain predicate
for the sign algorithm as an inductive property of triples $(c, d, p) : \mathbb{Z} \times \mathbb{Z} \times \mathbb{Q}^+$,
using only operations on integers and pattern matching on the first bit:

Inductive Φ_S : Z → Z → Q⁺ → Prop :=
  | Φ_S0 : ∀ (c d : Z) (p : Q⁺), p = One → c + d ≠ 0 → Φ_S c d p      (* p = One *)
  | Φ_S1 : ∀ (c d : Z) (xs : Q⁺), Φ_S c (c + d) xs → Φ_S c d (nR xs)   (* p = (nR xs) *)
  | Φ_S2 : ∀ (c d : Z) (xs : Q⁺), Φ_S (c + d) d xs → Φ_S c d (dL xs).  (* p = (dL xs) *)

The domain predicate for $\mathcal{H}$ is obtained by adapting Φ_S according to the
sign bit of each rational number:

Inductive Φ_H (c d : Z) (q : Q) : Prop :=
  | Φ_H0 : q = Zero → d ≠ 0 → Φ_H c d q
  | Φ_H1 : ∀ p : Q⁺, q = Qpos p → Φ_S c d p → Φ_H c d q
  | Φ_H2 : ∀ p : Q⁺, q = Qneg p → Φ_S (-c) d p → Φ_H c d q.
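The point of the inductive definition is that the condition c·q + d ≠ 0 can be decided using integer additions only, one bit at a time. The Python sketch below mirrors the two predicates and compares the first against a direct computation with fractions. It is an illustration under assumptions made here: sequences are strings over `L`/`R` (for dL/nR), a signed rational is a pair of a sign in {0, +1, -1} and such a string, and the names `phi_S`, `phi_H` and `value` are hypothetical.

```python
from fractions import Fraction


def phi_S(c: int, d: int, bits: str) -> bool:
    """Domain check for a positive input: True when c*q + d != 0 for q encoded by `bits`."""
    for b in bits:
        if b == "R":       # nR xs: continue with (c, c + d) on xs
            d = c + d
        else:              # dL xs: continue with (c + d, d) on xs
            c = c + d
    return c + d != 0      # One: require c + d != 0


def phi_H(c: int, d: int, sign: int, bits: str = "") -> bool:
    """Domain check for a signed input given as a sign and a positive sequence."""
    if sign == 0:
        return d != 0
    return phi_S(c if sign > 0 else -c, d, bits)


def value(bits: str) -> Fraction:
    """Exact value of a positive sequence, used only for cross-checking."""
    x = Fraction(1)
    for b in reversed(bits):
        x = x + 1 if b == "R" else x / (x + 1)
    return x
```

Each absorbed bit rewrites the pair (c, d) so that the sign of the denominator is preserved up to a positive factor, which is why the final test at One suffices.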

There is also a domain predicate for the output-bit algorithm, but that one 
is a consequence of the accessibility predicate (Sect. 4.4). 

The precise type of the Coq formalisation of the homographic algorithm
becomes H : ∀ (a b c d : Z) (q : Q), Φ_H c d q → Q. Since the type of (Φ_H c d q)
is Prop, the proof obligation is removed during the extraction (Sect. 6.1), and
the extracted program is close to the one given in [13]. The same technique is
used for the quadratic algorithm and the Coq formalisation of the function has
the type Q : ∀ (a b c d e f g h : Z) (q1 q2 : Q), Φ_Q e f g h q1 q2 → Q.

We need only one lemma to prove that the usual field operations satisfy the 
proof obligations: 
Lemma addmultPrf : ∀ (x y : Q), Φ_Q 0 0 0 1 x y.

Definition QplusLazy (x y : Q) := Q 0 1 1 0 0 0 0 1 x y (addmultPrf x y).
Definition QmultLazy (x y : Q) := Q 1 0 0 0 0 0 0 1 x y (addmultPrf x y).



4.4 Accessibility 

The mathematical argument ensuring that the output-bit algorithm terminates
is that the length of the input sequence decreases in absorption steps and that
the total sum of matrix coefficients decreases in emission steps. That
the total sum is decreasing cannot be checked syntactically. We follow
Bove and Capretta's method of formalising general recursive functions and define
the function's domain as an inductive predicate, so that the algorithm can be
described as a structurally recursive function with respect to this predicate. This
method is also known as recursion on an ad hoc predicate. For the reasons that
we discuss in Sect. 6.2, we use a variant of the method that was suggested by
Paulin [15] and further explored in detail in [4]. The domain of the function $\mathcal{B}$
will be quotiented by $\Psi_{\mathcal{H}}(a, b, c, d, p)$, which is an inductively defined predicate
that determines which of the 5-tuples $(a, b, c, d, p)$ are accessible for the recursive
branches of the output-bit algorithm (cf. [13, Def. 4.3]):



Definition isAbove (a b c d:Z) := (c<a ∧ d≤b) ∨ (c≤a ∧ d<b).

Inductive Ψ_h : Z → Z → Z → Z → Q+ → Prop :=
| Ψ_h0 : ∀ (a b c d:Z) (p:Q+), p = One → 0 < a+b → 0 < c+d → Ψ_h a b c d p
| Ψ_h1 : ∀ (a b c d:Z) (p:Q+), p ≠ One → isAbove a b c d →
         Ψ_h (a-c) (b-d) c d p → Ψ_h a b c d p
| Ψ_h2 : ∀ (a b c d:Z) (p:Q+), p ≠ One → ¬ isAbove a b c d →
         isAbove c d a b → Ψ_h a b (c-a) (d-b) p → Ψ_h a b c d p
| Ψ_h3 : ∀ (a b c d:Z) (xs:Q+), ¬ isAbove a b c d → ¬ isAbove c d a b →
         Ψ_h a (a+b) c (c+d) xs → Ψ_h a b c d (nR xs)
| Ψ_h3' : ∀ (a b c d:Z) (xs:Q+), ¬ isAbove a b c d → ¬ isAbove c d a b →
         Ψ_h (a+b) b (c+d) d xs → Ψ_h a b c d (dL xs).



We use this accessibility predicate to formalise the homographic output-bit 
algorithm. This function’s type becomes 



Q+_to_Q+ : ∀ (a b c d:Z) (p:Q+), Ψ_h a b c d p → Q+.



After defining this function in Coq, each time we want to use it we should
supply a term H_acc : (Ψ_h a b c d p). But we know that all the positive values
of a, b, c and d are in the function’s domain:



Lemma Ψ_h_Wf : ∀ (a b c d:Z) (p:Q+),
  0 < a+b → 0 < c+d → 0 ≤ a → 0 ≤ b → 0 ≤ c → 0 ≤ d → Ψ_h a b c d p.



We prove this lemma by well-founded induction on the intrinsic order of the
accessibility predicate. We denote this order by <_5 and we define it as follows.

$(a_1, a_2, a_3, a_4, p) <_5 (a'_1, a'_2, a'_3, a'_4, q)$ iff $\mathrm{len}(p) < \mathrm{len}(q) \lor \bigl(p = q \land \sum_{i=1}^{4} a_i < \sum_{i=1}^{4} a'_i\bigr)$.

We prove that this order is well-founded on the set (Z+)^4 × Q+ (where Z+ is the
set of nonnegative integers); the lemma Ψ_h_Wf is a direct consequence.
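
To see why the recursive calls of the accessibility predicate descend in this order, it may help to compare the arguments of two of its recursive branches. The worked inequalities below are only a sketch: they assume nonnegative coefficients with c + d > 0 (as in the hypotheses of the lemma) and that len counts the constructors of the input sequence, so that len(xs) < len(nR xs). In the emission branch, which recurses on (a-c, b-d, c, d, p), the sequence is unchanged and the coefficient sum drops; in the absorption branch, which recurses on (a, a+b, c, c+d, xs), the sequence gets shorter, so the growth of the coefficient sum does not matter.

```latex
% Emission step: p is unchanged, the coefficient sum drops by c + d > 0.
\[
(a-c) + (b-d) + c + d \;=\; (a+b+c+d) - (c+d) \;<\; a+b+c+d
\quad\Longrightarrow\quad
(a-c,\; b-d,\; c,\; d,\; p) \;<_5\; (a,\; b,\; c,\; d,\; p).
\]
% Absorption step: the input sequence gets shorter, whatever the sums do.
\[
\mathrm{len}(xs) \;<\; \mathrm{len}(\mathrm{nR}\; xs)
\quad\Longrightarrow\quad
(a,\; a+b,\; c,\; c+d,\; xs) \;<_5\; (a,\; b,\; c,\; d,\; \mathrm{nR}\; xs).
\]
```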

We take a similar approach for the formalisation of the output-bit algorithm
for the quadratic function. There we have an accessibility predicate Ψ_Q on the set
Z^8 × (Q+)^2. The well-founded order corresponding to this accessibility predicate is
an order on 10-tuples which is well-founded on the set (Z+)^8 × (Q+)^2. Consequently
we use well-founded induction to prove the following lemma:



Lemma Ψ_Q_Wf : ∀ (a b c d e f g h:Z) (p1 p2:Q+),
  0 < a+b+c+d → 0 < e+f+g+h → 0 ≤ a → 0 ≤ b → 0 ≤ c → 0 ≤ d →
  0 ≤ e → 0 ≤ f → 0 ≤ g → 0 ≤ h → Ψ_Q a b c d e f g h p1 p2.



This lemma shows that we can use the output-bit algorithm (the function 
B_2) to compute the quadratic transformations with nonnegative coefficients and
with at least one positive coefficient in the numerator and one positive coefficient 
in the denominator. In order to compute the quadratic transformations with 
negative coefficients and on negative inputs we use the quadratic algorithm (the 
function Q) to modify the coefficients with respect to the sign bit and call the 
output-bit algorithm with nonnegative coefficients. 



5 Correctness Proofs 

5.1 Using Strict Implementations 

In Sects. 3 and 4 we showed how we formalised strict and lazy arithmetic op- 
erations on the data type Q. In this section we discuss how the lazy algorithms 
were formally proven to be correct. One possible approach is similar to what we 
did for strict operations: to prove that the lazy operations satisfy all the axioms 
of a field. The second possibility is to use the strict algorithms as a specification 
for the lazy ones, with lemmata of the following form: 



Lemma QplusLazy_Qplus : ∀ (x y:Q), QplusLazy x y = Qplus x y.
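
The same specification pattern can be tried on a much smaller scale. The sketch below is not taken from QArith: it uses current Coq syntax and natural numbers, with the standard addition playing the role of the "strict" reference and a hypothetical accumulator-style function add_alt playing the role of the alternative implementation. The lemma has the same shape as the one above: the alternative implementation is proved equal to the reference on all inputs.

```coq
(* Hypothetical alternative implementation of addition on nat:
   it moves successors from the first argument to the second. *)
Fixpoint add_alt (n m : nat) : nat :=
  match n with
  | 0 => m
  | S k => add_alt k (S m)
  end.

(* The reference implementation (n + m) is used as the specification. *)
Lemma add_alt_spec : forall n m : nat, add_alt n m = n + m.
Proof.
  induction n as [|k IH]; intros m; simpl.
  - reflexivity.
  - rewrite IH.            (* goal: k + S m = S (k + m) *)
    symmetry.
    apply plus_n_Sm.       (* plus_n_Sm : S (n + m) = n + S m *)
Qed.
```

Once such a lemma is available, any property already proved for the reference implementation can be transferred to the alternative one by rewriting.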



In our development we took this approach, but we proved more general results.
The one for the quadratic algorithm has the following form, momentarily using
⊕ and ⊗ to denote the strict operations.



Theorem Q_Correctness : ∀ (a b c d e f g h:Z) (q1 q2:Q)
  (hyp: (e⊗q1⊗q2) ⊕ (f⊗q1) ⊕ (g⊗q2) ⊕ h ≠ Zero),
  Q a b c d e f g h q1 q2 (Q_nonzeroCorrect e f g h q1 q2 hyp) =
  ((a⊗q1⊗q2) ⊕ (b⊗q1) ⊕ (c⊗q2) ⊕ d) ⊗ (Qinv ((e⊗q1⊗q2) ⊕ (f⊗q1) ⊕ (g⊗q2) ⊕ h)).



The correctness of field operations is just a special instance of this general
theorem. In the statement of the above theorem, the term Q_nonzeroCorrect
corresponds to the correctness of the lazy proof obligations that we defined in
Sect. 4.3.

5.2 Using the field Tactic

The field tactic was devised by Delahaye and Mayero [7] to ease the equa- 
tional reasoning on field structures in Coq. It is a decision procedure for simple 
equations that generates proof obligations for each occurrence of the division. 

We could use this tactic directly after proving that the strict operations 
of Sect. 3 satisfy the field axioms. It was of great help in the correctness proof; 
however, there were instances where we had to fine-tune the reduction behaviour
of the field tactic to prevent unnecessary reduction that slows the tactic down
to an unacceptable level. The default reduction behaviour is based
on eager reduction, probably because the original design was based on an
axiomatic field structure. Our experience shows that the field tactic would be
more useful for equational reasoning on concrete fields (as opposed to abstract
axiomatic fields) if this reduction behaviour were less eager. Note that the Ring
tactic, which is the Coq tactic for equational reasoning on rings, does not have 
this problem with eager reduction; hence it is very useful in reasoning about 
concrete rings such as the ring of integers. 

After proving the correctness of the lazy algorithms, we could define a sec- 
ond field structure based on lazy operations. Therefore we have a single data 
type with two different field structures on it. This is an interesting situation; it 
deserves a deeper investigation to see whether it is useful — from the theorem 
proving point of view — to have two underlying fields on the same carrier type, 
or whether it adds to the complexity of the proofs. 



5.3 Functional Induction 

As is evident from the quadratic and homographic algorithms given in [13]
and the formalisation we discussed in Sect. 4, we are dealing with functions of up
to 11 arguments (in quadratic algorithms) which are defined by case distinctions
of up to 43 cases (in homographic and quadratic sign algorithms). The case
distinctions in the definitions of functions get in the way when we want to prove
these functions’ properties. This means that if we want to prove the correctness
of the homographic sign algorithm, we should consider 43 different cases. During
the proof of correctness many of these cases should be handled in a similar
way; they can be solved by automatic tactics or are degenerate cases. The tactic
functional induction was designed by Barthe and Courtieu [2] to assist the user
in dealing with these situations by providing some automation. When given a
Coq function, the tactic functional induction tries to automatically generate 
an elimination principle which is tailored to the shape of that function. It then 
applies this elimination principle on the current goal generating all the possible 
different cases based on the case distinctions in the definition of the function. This 
will generate one subgoal for each case; the tactic then applies some heuristics to 
solve as many subgoals as possible. In our project, in proving the correctness of 
the lazy operations, we benefited immensely from the beta version of this tactic. 
Our usage also contributed to making this tactic more efficient by exposing some
of the bugs of that version. Our experience shows that this tactic can make Coq a
better framework for reasoning about realistic algorithms which are often based
on heavy case analysis on a multitude of arguments.



6 Programs versus Proofs 

The algorithms for lazy arithmetic on Q+ were first implemented in Haskell. The
original Haskell implementation was about 16 kilobytes of code. The Coq formalisation
of the lazy algorithms led to fixing some exception handling bugs in
the original Haskell code. Moreover, the Coq formalisation highlighted the symmetries
between fractions, homographic transformations and quadratic transformations
as members of the larger family of multilinear functions. This resulted
in generalising the algorithms to multilinear forms in n variables [13]. Such improvements
show some advantages of formalising functions in type theory. There
is, however, the disadvantage that formalising programs in type theory is a
time-consuming process; the amount of automation and heuristics available in
present-day theorem provers is far from satisfactory.

Table 1 shows the relative size (in kilobytes) of the various phases of the formalisation.
During this project we used the most novel facilities of Coq. The
statistics in Table 1 might thus discourage people from using type theoretical
tools for verification purposes. Our answer is that, without the existing automation
tools in recent versions of Coq, and without the recent theoretical advances,
such a project would have seemed impossible only a couple of years ago. This makes us
confident that in the coming years, similar projects will help in improving theorem
provers and making them more programmer-oriented, and will ease the task
of formalising and verifying algorithms in theorem provers based on type
theory. Our second argument is that such formalisations have a generic nature
and can be applied to similar algorithms. The lazy algorithms of QArith are
inspired by, and very similar to, the existing algorithms for arithmetic on continued
fractions [8,18,12]. We believe that a verification of continued fraction
arithmetic is possible based on our QArith project (see Sect. 7).



6.1 Extraction 

One important aspect of the type theory of Coq is the distinction between informative
and non-informative objects. The informative objects are terms of type Set,
and consist of those whose computational content is important for the programmer.
The non-informative objects are terms of type Prop, which bear solely
logical and not computational importance. Inside the type theory of Coq, however,
these non-informative objects are first-class citizens and they should be
type-checked and evaluated when necessary.

In order to recapture the computational content of the formalised programs,
Coq has the program extraction mechanism. This is a tool to extract the underlying
program of an object in type theory into a program in a conventional
programming language [16,10].

In our case, after finishing the formalisation of the lazy algorithms, we used
program extraction into Haskell to obtain the verified version of the algorithms.
It is interesting to compare the Coq-generated Haskell code (postverification
code) with the original hand-written Haskell code (preverification code). Not
surprisingly, the basic algorithms have the same time and space complexity. The
main difference is that in the postverification code all the usual data types such
as natural numbers, booleans and integers are reimplemented as new algebraic
data types in Haskell, while in the preverification code we use the data types
already defined in the standard prelude of Haskell (and sometimes even built-in
as primitive data types). This makes the preverification code much faster; it is
also the main cause of the difference in the size of the pre- and postverification
code (see Table 1).



Table 1. Comparison of the ASCII size of programs and proofs.

development          Coq code
strict operations     112 KB
lazy operations       748 KB
correctness           304 KB
total project        1164 KB

function       preverification   Coq code   postverification 3
homographic     8 KB             200 KB     20 KB  (8 KB)
quadratic       8 KB             476 KB     60 KB (20 KB)
total lazy     16 KB             748 KB     88 KB (32 KB)



There is another difference between the pre- and postverification code. Recall
that in Coq we had to add the lazy proof obligations and the accessibility
predicates to the definition of the functions. Those terms were non-informative
objects from the programmer’s point of view; hence they had the type Prop.
During the extraction all the terms of type Prop are replaced by the sole
constructor of the unit data type, which is merely a dummy term in Haskell.
Thus, for example, the homographic function in the postverification code has 6 arguments,
whereas in the preverification code it has 5 arguments. This difference
is practically negligible and does not affect the performance of the postverification
code.
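
The erasure of Prop-sorted arguments can be observed on a toy function. The following is a minimal sketch in current Coq syntax, not code from QArith; safe_pred and its hypothesis h are hypothetical names chosen for illustration. The argument h lives in Prop, so it carries no computational content and does not survive as data in the extracted Haskell program. Note that recent Coq versions typically remove such an argument from the extracted signature altogether instead of passing a dummy unit value, so the printed result may differ in detail from the behaviour described above.

```coq
Require Extraction.

(* Hypothetical example: a predecessor function guarded by a proof
   obligation. The hypothesis h : n <> 0 has a type in Prop. *)
Definition safe_pred (n : nat) (h : n <> 0) : nat :=
  match n with
  | 0 => 0        (* unreachable when h holds; any value will do *)
  | S k => k
  end.

(* Print the Haskell code that Coq extracts for safe_pred. *)
Extraction Language Haskell.
Extraction safe_pred.
```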



6.2 Prop-Sorted Accessibility 

In the Coq formalisation of function B of Sect. 4.2, two things are evaluated in
every recursive branch: the value of the function (based on the input sequence and
the four coefficients which are carried around) and the new subdomain for the
recursive call. Recall that if we extract this term from Coq to a Haskell program,
all the terms of type Set will be extracted. Originally, in Bove and Capretta’s
method, the inductive domain of a non-structurally recursive function is a term
of type Set. This is because Bove and Capretta work in Martin-Löf type theory,
where there is no distinction between Set and Prop. This means that if we follow
Bove and Capretta’s method and take the domain of the function B to be an
inductively defined set rather than a predicate, then in the Haskell extraction of the
function the inductively defined accessible domain is also extracted; this
considerably decreases the efficiency.

3 The number in brackets denotes the size of the extracted code disregarding the
extraction of the basic libraries.

Incidentally that is the approach we took in the beginning. Later we modified 
the whole formalisation and we used the Prop-sorted accessibility. Our tests 
showed a 25% to 30% decrease in both time and memory usage of the extracted 
algorithms. However, for evaluation inside Coq the time and memory complexity
of the proof objects does not change. This emphasises the importance of program
extraction as one of the basic philosophies behind the design of the type system
of Coq compared to Martin-Löf type theory.

We mentioned that our first approach was to follow Bove and Capretta’s
original method and use Set-sorted accessibility. This is because, unfortunately,
the second approach is more technical and requires an advanced knowledge of
the internals of Coq [4]. The first author initially applied Bove and
Capretta’s original method; the second author showed how it is possible to modify the
proofs to suit the Prop-sorted variant. During this modification a detailed study
of the proof terms of Coq was necessary. 
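
The cost of a Set-sorted domain can be illustrated on a toy partial function. The sketch below is not the QArith development: it uses current Coq syntax, and the names evenS and half are hypothetical. The domain of halving (the even numbers) is defined as an inductive family in Set, and the function is written by plain structural recursion on the domain term. Because evenS lives in Set, extraction keeps it as a data type, and the extracted function still takes and traverses a domain value at run time. With a Prop-sorted domain the same argument would be erased, at the price of the more technical inversion lemmas mentioned above.

```coq
Require Extraction.

(* Hypothetical Set-sorted domain: evenS n is inhabited exactly when n is even. *)
Inductive evenS : nat -> Set :=
| evenS_O  : evenS 0
| evenS_SS : forall n : nat, evenS n -> evenS (S (S n)).

(* Halving, defined by structural recursion on the domain term d. *)
Fixpoint half (n : nat) (d : evenS n) {struct d} : nat :=
  match d with
  | evenS_O => 0
  | evenS_SS m d' => S (half m d')
  end.

(* Inside Coq the domain term must be supplied explicitly; this evaluates to 2. *)
Eval compute in half 4 (evenS_SS 2 (evenS_SS 0 evenS_O)).

(* The extracted code keeps the evenS argument, since it lives in Set. *)
Extraction Language Haskell.
Extraction half.
```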

6.3 Computations Inside Coq 

One of the main objectives of the project was to provide Coq with a library of 
arithmetic on rational numbers. This library had to be similar to the existing 
libraries for natural numbers and integers. This means that we should at least be 
able to perform easy computations in the language of Coq. The QArith library 
fulfills this requirement. After defining the strict operations, one can add a pretty
printer and a parser for expressions involving rational numbers. This is made
easy in recent versions of Coq, where the user can extend the grammar
of Coq.

In Sect. 4.1 we argued that the lazy functions on sequences are more efficient.
This is true in a programming language like Haskell, where we do not bother
with termination checking. But could we use the lazy operations in order to
do arithmetic inside Coq? The answer is negative. The reason is that inside
Coq all the proof obligations and accessibility predicates, albeit computationally
irrelevant, should be type-checked and evaluated. Consequently, a full evaluation
of the quadratic function inside Coq results in an explosion of proof terms and,
with current computational power, is impractical. However, as we discussed in
Sect. 6.1, program extraction erases all the non-informative terms and
one ends up with an efficient program. Therefore, for computations inside Coq
we use the strict version of the field operations.

7 Conclusion and Further Work 

Our experience with the formalisation of the QArith library shows that the Coq theorem
prover, in its current state, is not only a good framework for the formalisation of
mathematical structures and their purely algebraic properties, but is also capable
of being used to verify nontrivial algorithms. The algorithms that we formalised
have the same underlying complexity as the state-of-the-art algorithms in the
field of exact arithmetic [17]. We also contrasted the preverification code with
the postverification code. This consists of starting from hand-written code,
formalising it in a theorem prover which offers the possibility of program extraction,
and finally extracting into the original programming language to obtain
the verified executable code. We believe that a more careful investigation of the
difference between pre- and postverification code gives the designers of programming
languages (resp. theorem provers) valuable insight into the logical (resp.
computational) power of their products.

We discuss two possible extensions of our work. The first is to consider another
important non-redundant representation for rational numbers, namely the continued
fraction representation. The algorithms for the continued fraction representation
are more complicated than the algorithms we formalised in the QArith project.
Nevertheless, extending the present work, one could use the intrinsic similarity
between our algorithms and the algorithms of continued fraction arithmetic in
order to verify the correctness of those ubiquitous algorithms. This requires a
clever reuse of the proof objects that we supplied during the present work, in order
to minimise the amount of additional effort. The recent work by Magaud [11]
seems to provide a useful theoretical background for this approach.

The second possible improvement on our work is to extend the inductive 
data types and the lazy algorithms on them to coinductive data types and core- 
cursive functions on them, in order to obtain a verified exact arithmetic on real 
numbers. In Haskell there is no distinction between inductive types (data) and 
coinductive types (codata); therefore, all our algorithms written in Haskell are 
valid for potentially infinite sequences. But in Coq there is a clear distinction
between infinite and finite objects, and one has to use coinductive types in order
to formalise algorithms that work on streams, even though the algorithms
extracted into Haskell will be identical to those for finite sequences. In
an upcoming work the first author will describe the admissible representations
(those which come with an intuitive notion of computability induced by the
Cantor space topology) based on the Stern-Brocot tree and formalisable by
means of coinductive types. One problem to tackle is the syntactic constraints that
Coq puts on corecursive functions. These constraints are the dual of the
constraints for structural recursion; they require approaches similar to the
ones we discussed in Sects. 4.4 and 6.2.



Abstract. The need for formal methods for certifying the good behaviour
of computer software is dramatically increasing with the growing
complexity of the latter. Moreover, in the global computing framework
one must face the additional issues of concurrency and mobility. In
recent years many new process algebras have been introduced in order
to reason formally about these problems; the common pattern is to specify
a type system which allows one to discriminate between “good” and
“bad” processes. In this paper we focus on an incremental type system
for a variation of the Ambient Calculus called M³, i.e., Mobility types for
Mobile processes in Mobile ambients, and we formally prove its soundness
in the proof assistant Coq.



1 Introduction 

Recently, due to the widespread use of the Internet and to the appearance of
new mobile devices (PDAs, smart phones etc.), the traditional notion of computing
is quickly fading away, giving birth to new paradigms. Indeed, the need for
entities moving from one location to another to exchange data and to work
cooperatively towards a common goal gives rise to new non-trivial problems.
In order to formally describe and reason about these new computing paradigms, a
plethora of calculi have been proposed. Among them, the Ambient Calculus [1]
is a process algebra specifically designed to model the mobility of agents
in a dynamic hierarchy of domains (ambients) with local communications. The
interest in this calculus is witnessed by the growing number of variants
recently proposed in the literature.

Since the original formulation of the Ambient Calculus, many studies have
been carried out in order to find satisfactory alternatives to the open primitive,
i.e., the capability that dissolves an ambient, revealing its internal structure.
Indeed, this is considered a potentially dangerous operation since an agent could
maliciously destroy from the outside a domain containing processes operating
on sensitive data.

In this paper we focus on a variant of the Ambient Calculus (originally introduced
in [3]) which allows inter-ambient communication, replacing the open
primitive with a “to” instruction which can move lightweight processes (i.e., lists
of capabilities) without the need to enclose them in an ambient. On top of
the language there is also a type system which regulates the behaviour of processes.
Indeed, type systems are essential components in many ambient calculi
because they allow one to discriminate between “good” processes and “bad” ones;
this is extremely important when one wants to model security properties in the
global computing framework. A very good feature of the type system introduced
in [3] is that it allows one to type components in incomplete environments, i.e., it is
incremental. Moreover, there is a type inference algorithm which can be used on
a “raw” term in order to derive the minimum requirements for accepting it as a
good process, which also provides a notion of principal type.

The ultimate result of this paper is the formally certified correctness proof of
the type inference algorithm. This formal proof was carried out in the Coq system
(developed at the INRIA research institute [8]) incrementally with the definition
of the type inference rules introduced in [3], and on a few occasions it actually
suggested the correct formulation of the inference rules themselves. Completeness
will be addressed in future work.

We capitalize on the Higher-Order Type Theory approach featured by Coq
[2, 5, 9, 7]. In particular, we use Higher-Order Abstract Syntax (HOAS) and
we represent binders by means of higher-order (i.e., functional) constants. We
encode the typing and inference rules in natural deduction style. Thus, we avoid
an explicit encoding of many tedious mechanisms like, e.g., alpha-conversion,
schemata instantiation, side conditions about the freshness of bound variables
and the treatment of typing environments by means of lists.

We also capitalize on some interesting features of Coq, such as Leibniz
equality and the associated Rewrite tactic, in order to deal with unification.

Synopsis. In Section 2 we introduce the M³ calculus, the typing system and
the related type inference mechanism presented in [3]. Each notion of the object
language is followed by the description of the corresponding representation in
Coq. Section 3 is devoted to the formal derivation in Coq of the soundness of
the type inference algorithm. Finally, in Section 4 we draw some conclusions.



2 M³

In this section we briefly recall the syntax of the M³ calculus together with the
corresponding higher-order encoding in Coq; for further details and application
examples of the object language, the interested reader is referred to [3]. We
will skip the notion of structural equivalence and the reduction semantics of the
calculus since they are not relevant for our purposes.



2.1 The Object Language and Its Encoding 

There are four basic syntactic categories, i.e., ambient names, groups, capabilities
and processes, which are annotated with types (see Section 2.2). Capabilities
(messages) are defined by the following grammar:

M, N, L ::=
    m, n, ..., x, y, ...   ambient names, variables
    in M                   moves the containing ambient into ambient M
    out M                  moves the containing ambient out of ambient M
    to M                   goes out from its ambient into a sibling ambient M
    M.M'                   path

As for processes, the reader should notice that the restriction
operator on groups is polyadic, i.e., it binds several groups at once. According
to the authors of [3], this is needed since groups can have mutually dependent
group types. The grammar defining processes is the following:

P, Q, R ::=
    0                  null
    M.P                prefixed
    ⟨M⟩.P              synchronous output
    (x:W).P            typed input
    P | Q              parallel composition
    M[P]               ambient
    !P                 replication
    (νn:amb(g))P       name restriction
    (ν{g:G}(k))P       group restriction

In order to “reconcile” the inductive features of Coq with the HOAS approach,
we represent the syntactic categories of names and groups by means of two
parametric types:

Parameter name : Set.
Parameter group : Set.

Thus, there is no risk of deriving exotic terms (i.e., legal terms which do not
correspond to an entity of the object language [4]). Specific names and groups are
rendered by means of Coq metavariables of type name and group, respectively.
The encoding of capabilities is straightforward:

Inductive cap : Set :=
    name2cap : name -> cap
  | In : cap -> cap
  | Out : cap -> cap
  | to : cap -> cap
  | path : cap -> cap -> cap.

As for processes, there is a small issue to overcome if we want to
stick to the HOAS approach. Indeed, we said that the restriction operator on
groups is polyadic; hence, since the λ-abstraction operator of the type theory
underlying the Coq system is monadic, we have to use a mutual inductive type
in order to mimic a simultaneous group restriction:

Mutual Inductive proc : Set :=
    nil : proc
  | action : cap -> proc -> proc
  | output : cap -> proc -> proc
  | input : (name -> proc) -> msgType -> proc
  | par : proc -> proc -> proc
  | ambient : cap -> proc -> proc
  | bang : proc -> proc
  | nu : group -> (name -> proc) -> proc
  | nuG : res -> proc
with res : Set :=
    proc2res : proc -> res
  | resG : groupType -> (group -> res) -> res.

The role of terms of type res is to encode group restrictions by grouping together
several monadic abstractions. For instance, the M³ process (ν{g1:G1, g2:G2})0
is encoded by (nuG (resG G1 [g1:group] (resG G2 [g2:group] (proc2res
nil)))).

Since cap and proc are inductive types, the Coq system automatically provides,
for free, induction and recursion principles, which are very useful in
speeding up the formal development of the metatheory.

Notice how the encoding of the calculus, following the principles for encoding
syntax in LF (originally proposed in [5]) and the standard specification language
provided by type theory, enhances the readability of the original presentation of
the calculus. It is often the case that LF encodings enhance the syntax and allow
one to eliminate unnecessary idiosyncrasies.

In this paper we are only interested in the type inference algorithm; hence, we 
do not recall the notions of structural congruence and of the reduction system. 
The interested reader is referred to [3] for more details. 
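
For readers using a current Coq version, the HOAS encoding of processes can be replayed with today's syntax. The following is only a sketch under stated assumptions, not the authors' source file: the types cap, msgType and groupType are kept abstract as parameters, name2cap is declared as a parameter instead of a constructor, and the two definitions at the end (two_groups and fresh_ambient) are hypothetical examples. It shows how object-level binders become Coq functions: a polyadic group restriction is a chain of resG abstractions, and a name restriction passes the bound name to its body.

```coq
Parameter name : Set.
Parameter group : Set.
(* Kept abstract in this sketch; the paper defines them as inductive types. *)
Parameter cap : Set.
Parameter msgType : Set.
Parameter groupType : Set.
Parameter name2cap : name -> cap.

Inductive proc : Set :=
  | nil : proc
  | action : cap -> proc -> proc
  | output : cap -> proc -> proc
  | input : (name -> proc) -> msgType -> proc
  | par : proc -> proc -> proc
  | ambient : cap -> proc -> proc
  | bang : proc -> proc
  | nu : group -> (name -> proc) -> proc
  | nuG : res -> proc
with res : Set :=
  | proc2res : proc -> res
  | resG : groupType -> (group -> res) -> res.

(* The process that restricts two groups g1:G1 and g2:G2 around the null process. *)
Definition two_groups (G1 G2 : groupType) : proc :=
  nuG (resG G1 (fun g1 : group =>
       resG G2 (fun g2 : group => proc2res nil))).

(* Hypothetical example: a fresh name n of group g, used as the name of an
   empty ambient n[0]. The bound name is the argument of the Coq function. *)
Definition fresh_ambient (g : group) : proc :=
  nu g (fun n : name => ambient (name2cap n) nil).
```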



2.2 Mobility Types and Their Encoding 



In order to avoid dependent types, in [3] the authors adopt an approach based
on groups; hence, there are basically three categories of types: ambient types,
capability types and process types. Groups are denoted by the letters g, h, ...,
sets of groups are denoted by S, C, E, ... and the syntax for the M³ types is
defined as follows:

Amb ::= amb(g)          ambient type: ambients of group g
Pro ::= proc(g)         process type: processes that can stay in ambients
                        of group g
Cap ::= Pro1 → Pro2     capability type: capabilities that, prefixed to
                        a process of type Pro1, turn it into a process
                        of type Pro2
W ::=                   message type
      Amb               ambient type
      Cap               capability type
T ::=                   communication type
      shh               no communication
      W                 communication of messages of type W
G ::= gr(S, C, E, T)    group type

We recall from [3] that the meaning of the statement g : gr(S, C, E, T) is the
following:

- S is the set of ambient groups where the ambients of group g can stay;

- C is the set of ambient groups that the ambients of group g can cross;

- E is the set of ambient groups that lightweight g-processes can enter;

- T is the communication type of g-ambients.

If G = gr(S, C, E, T), the notation S(G), C(G), E(G) stands for the components
S, C, E of G. Following the notational remark on page 7 of [3], which says that we can
simply write g both for amb(g) and for proc(g) since the distinction is always
clear from the context, the encoding in Coq of message and communication types
is straightforward:

Inductive msgType : Set :=
    amb_type : group -> msgType
  | cap_type : group -> group -> msgType.

Inductive comType : Set :=
    Shh : comType
  | msg : msgType -> comType.

Group types are encoded by the following inductive type, featuring only one
constructor:

Inductive groupType : Set :=
    gr : Glist -> Glist -> Glist -> comType -> groupType.

For the sake of simplicity we chose to render in Coq the sets of group names
S, C, E occurring in a group type by means of lists of elements of type group:

Inductive Glist : Set :=
    emptyG : Glist
  | consG : group -> Glist -> Glist.
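
Side conditions of the typing rules, such as a group belonging to one of the sets of a group type, then become statements about these lists. The sketch below uses current Coq syntax and is an assumption about how this could be done, not the authors' code: comType is kept abstract, and the names inG, S_of, C_of, E_of and cross_allowed are hypothetical. It defines membership in a Glist, the three projections of a group type, and the membership side condition of the crossing rules.

```coq
Parameter group : Set.
Parameter comType : Set.   (* kept abstract in this sketch *)

Inductive Glist : Set :=
  | emptyG : Glist
  | consG : group -> Glist -> Glist.

Inductive groupType : Set :=
  | gr : Glist -> Glist -> Glist -> comType -> groupType.

(* Hypothetical membership predicate: g occurs in the list l. *)
Fixpoint inG (g : group) (l : Glist) : Prop :=
  match l with
  | emptyG => False
  | consG h t => h = g \/ inG g t
  end.

(* Hypothetical projections for the three sets of a group type. *)
Definition S_of (G : groupType) : Glist := match G with gr s _ _ _ => s end.
Definition C_of (G : groupType) : Glist := match G with gr _ c _ _ => c end.
Definition E_of (G : groupType) : Glist := match G with gr _ _ e _ => e end.

(* Side condition "g1 belongs to C(G2)" written with the helpers above. *)
Definition cross_allowed (g1 : group) (G2 : groupType) : Prop :=
  inG g1 (C_of G2).

(* A small sanity check: the head of a list is a member of it. *)
Lemma inG_head : forall (g : group) (l : Glist), inG g (consG g l).
Proof. intros g l. simpl. left. reflexivity. Qed.
```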

In order to avoid an explicit treatment of typing environments as lists of 
typing statements, we render them by means of two parametric judgments: 

Parameter type_group: group -> groupType -> Prop. 

Parameter type_name: name -> group -> Prop. 

such that (type_group g G) holds iff g has group type G and (type_name n g) 
holds iff the name n has type g in the current environment. For instance, the 
environment {g : G, n : g} is rendered in Coq by declaring the following: 

Parameter dg : (type_group g G).

Parameter dn : (type_name n g).

This choice, followed by the rephrasing of the sequent-style rules of the typing
system (see Figure 1) in natural deduction, completely delegates the treatment of
environments to Coq’s metalanguage. As an example of the mapping
from sequent-style typing rules to natural deduction style ones, let us consider
the case of (AMB RES) in Figure 1:

    Γ, m:g' ⊢ P:g
    ------------------
    Γ ⊢ (νm:g')P:g

the corresponding Coq encoding is the following:

good_proc_res : (P:name->proc) (g,g':group)
    ((m:name) (type_name m g') -> (good_proc (P m) g)) ->
    (good_proc (nu g' P) g)

Notice how the premise Γ, m:g' ⊢ P:g is represented by the hypothetical judgment
((m:name) (type_name m g') -> (good_proc (P m) g)),
where (type_name m g') corresponds in natural deduction to the discharged
hypothesis in the following rule:

    (m:g', m fresh)
          ⋮
         P:g
    ----------------
     (νm:g')P:g

Following this approach, the whole type system is encoded by means of the
following inductive predicates (the first for capabilities and the second for
processes):

Inductive good_msg : cap -> msgType -> Prop := ... 

Inductive good_proc : proc -> group -> Prop := ... 

Due to lack of space we do not report the complete definitions of these predicates,
which are available at www.dimi.uniud.it/~scagnett/Coq-Sources/m3coq.v.
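Intuitively, (good_msg M W) holds iff M has type W and (good_proc P g) holds
iff P has type g in the current environment.

\[
\frac{\xi : A \in \Gamma}{\Gamma \vdash \xi : A}\ \text{(ENV)}
\qquad
\frac{\Gamma \vdash g_2 : G_2 \quad \Gamma \vdash M : g_1 \quad g_1 \in \mathcal{C}(G_2)}{\Gamma \vdash \mathsf{in}\ M : g_2 \to g_2}\ \text{(IN)}
\]

\[
\frac{\Gamma \vdash g_1 : G_1 \quad \Gamma \vdash g_2 : G_2 \quad \Gamma \vdash M : g_1 \quad g_1 \in \mathcal{C}(G_2) \quad \mathcal{S}(G_1) \subseteq \mathcal{S}(G_2)}{\Gamma \vdash \mathsf{out}\ M : g_2 \to g_2}\ \text{(OUT)}
\]

\[
\frac{\Gamma \vdash g_2 : G_2 \quad \Gamma \vdash M : g_1 \quad g_1 \in \mathcal{E}(G_2)}{\Gamma \vdash \mathsf{to}\ M : g_1 \to g_2}\ \text{(TO)}
\qquad
\frac{\Gamma \vdash M : g_3 \to g_2 \quad \Gamma \vdash N : g_1 \to g_3}{\Gamma \vdash M.N : g_1 \to g_2}\ \text{(PATH)}
\]

\[
\frac{}{\Gamma \vdash 0 : g}\ \text{(NULL)}
\qquad
\frac{\Gamma \vdash P : g \quad \Gamma \vdash Q : g}{\Gamma \vdash P \mid Q : g}\ \text{(PAR)}
\qquad
\frac{\Gamma \vdash P : g}{\Gamma \vdash {!P} : g}\ \text{(REPL)}
\]

\[
\frac{\Gamma \vdash M : g_1 \to g_2 \quad \Gamma \vdash P : g_1}{\Gamma \vdash M.P : g_2}\ \text{(PREFIX)}
\qquad
\frac{\Gamma,\ x : W \vdash P : g \quad \Gamma \vdash g : gr(\mathcal{S}, \mathcal{C}, \mathcal{E}, W)}{\Gamma \vdash (x : W).P : g}\ \text{(INPUT)}
\]

\[
\frac{\Gamma \vdash P : g \quad \Gamma \vdash M : W \quad \Gamma \vdash g : gr(\mathcal{S}, \mathcal{C}, \mathcal{E}, W)}{\Gamma \vdash \langle M \rangle.P : g}\ \text{(OUTPUT)}
\]

\[
\frac{\Gamma \vdash P : g \quad \Gamma \vdash M : g \quad \Gamma \vdash g : G \quad g' \in \mathcal{S}(G)}{\Gamma \vdash M[P] : g'}\ \text{(AMB)}
\qquad
\frac{\Gamma,\ m : g' \vdash P : g}{\Gamma \vdash (\nu m : g')P : g}\ \text{(AMB RES)}
\]

\[
\frac{\Gamma,\ g_1 : G_1, \ldots, g_k : G_k \vdash P : g \quad g_i \notin GN(\Gamma) \quad g_i \neq g \quad (1 \le i \le k)}{\Gamma \vdash (\nu\{g_1 : G_1, \ldots, g_k : G_k\})P : g}\ \text{(GRP RES)}
\]

Fig. 1. $M^3$ typing rules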



2.3 Type Inference Rules 

It is important to notice that we do not provide an implementation in Coq of
the inference algorithm introduced in [3]. The purpose of our work is to check
that the type inference rules in Figures 12 and 13 of [3] are sound with respect to
the typing rules of $M^3$. Hence, we work at a logical level, showing that for every
judgment derived using the type inference rules there is a corresponding typing
judgment obtained by means of the rules in Figure 1. For instance, while in
Definition 3 of [3] the authors deal with the notion of completion-unifiers and the
effective way to compute them, we focus on the unification constraints generated
by that computation process. Indeed, at a logical level we
can forget about the “real shape” of the generated substitution. In this section
we illustrate the main ideas behind our encoding.

The type inference algorithm introduced in [3] starts computing a type from 
a “raw” process, i.e., a well formed process without type annotations and group 
restrictions (since the latter can always be “pulled out” in front of the process 
using structural congruence rules): 

\[ P, Q, R ::= 0 \;\mid\; M.P \;\mid\; \langle M \rangle.P \;\mid\; (x).P \;\mid\; P \mid Q \;\mid\; M[P] \;\mid\; {!P} \;\mid\; (\nu n)P \]

Thus in Coq we introduce a suitable type raw_proc representing raw processes:

Inductive raw_proc: Set :=
raw_nil : raw_proc
| raw_action : cap -> raw_proc -> raw_proc
| raw_output : cap -> raw_proc -> raw_proc
| raw_input : (name -> raw_proc) -> raw_proc
| raw_par : raw_proc -> raw_proc -> raw_proc
| raw_ambient : cap -> raw_proc -> raw_proc
| raw_bang : raw_proc -> raw_proc
| raw_nu : (name -> raw_proc) -> raw_proc.



Each constructor corresponds directly to a constructor of type proc, if we do 
not consider the group restriction operator. 

Moreover, during the type inference process for capabilities some occurrences
of group names in the $\mathcal{S}$ component of group types are marked with a *,
so that the correct set of group names where an ambient can stay can be inferred
later; we need to reflect this fact in our encoding. Hence, we introduce the
type star_group, which admits “starred” group names besides “normal” ones:



Inductive star_group : Set :=
simple : group -> star_group
| star : group -> star_group.

So, (simple g) encodes a “normal” group name g, while (star g) represents
a “starred” group name g*. It follows that the first component $\mathcal{S}$ of group types
must be a list of elements of type star_group (instead of elements of type
group):



Inductive starGlist : Set :=
starEmptyG : starGlist
| starConsG : star_group -> starGlist -> starGlist.

Thus, group types (with “starred” elements in the first component S ) are 
recorded by means of a new inductive judgment: 

Inductive starGroupType : Set := 

gr_star: starGlist -> Glist -> Glist -> comType -> starGroupType. 

where the only difference w.r.t. the previous predicate groupType is the fact 
that the first argument of the constructor gr_star is a term of type starGlist 
instead of type Glist. 

Since we want to reason about the type environment synthesized by the 
algorithm proposed in [3], we need to introduce a suitable type env in order to 
“manipulate” environments at the object level: 

Inductive env: Set :=
emptyE : env
| consEgroup : group -> starGroupType -> env -> env
| consEname : name -> msgType -> env -> env.

There are three constructors: one for the empty environment (emptyE), another
for recording statements like $g : gr(\mathcal{S}, \mathcal{C}, \mathcal{E}, T)$ (consEgroup) and the last one for
statements like $n : g$ or $x : W$ (consEname).

The next step consists in the encoding of the operations performed on the 
environments during the type inference process. More precisely, we have to rep- 
resent completion-unifiers, compressions and closures. 

As far as unifications are concerned, we prefer not to deal with them explicitly,
in order to avoid getting lost in syntactical details. Hence, we represent them
in the form of identity constraints between terms. These constraints are rendered
by means of Leibniz equalities in higher-order schematic judgments, so that
the tactic Rewrite can be used to effectively unify the terms when needed.
For instance, in the rule (I-Path) of Figure 12 in [3] the unification
$\phi(\{(W, g_1 \to g_2), (W', g_3 \to g_1)\})$ is rendered by requiring the validity of the Leibniz equalities
W=(cap_type g1 g2) and W'=(cap_type g3 g1). Moreover, when there is the
need of unifying two environments e1 and e2, we render the completion-unifier
constraints by means of the judgment (unify_env e1 e2), where unify_env is
defined as follows:

Definition unify_env : env -> env -> Prop := [e1:env] [e2:env]
((n:name) (W1,W2:msgType) (name_in_env n W1 e1) ->
(name_in_env n W2 e2) -> W1=W2) /\
((g:group) (S1,S2:starGlist) (C1,C2,E1,E2:Glist) (t1,t2:comType)
(group_in_env g (gr_star S1 C1 E1 t1) e1) ->
(group_in_env g (gr_star S2 C2 E2 t2) e2) -> t1=t2).

where (name_in_env n W e) holds iff the association between the name
n and the message type W occurs in the environment e. Similarly,
(group_in_env g t e) holds iff the association between the group name g and
the group type t occurs in e. Hence, (unify_env e1 e2) means that e1 and
e2 must agree on the types of names and on the communication
types inside the group types associated with the same group.
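
To make the reading of the first conjunct concrete, here is a small sketch in present-day Coq syntax, restricted to name associations only. It rests on assumptions: `name_in_env` is not shown in the text, so the inductive definition below (with hypothetical constructor names `here` and `there`) is just one plausible way to define it, and `unify_names` is a hypothetical name for the name-only half of unify_env.

```coq
Parameter name msgType : Set.

(* Environments restricted to name associations (assumption:
   the consEgroup constructor is omitted for brevity). *)
Inductive env : Set :=
  | emptyE : env
  | consEname : name -> msgType -> env -> env.

(* One plausible definition of membership; not taken from the paper. *)
Inductive name_in_env : name -> msgType -> env -> Prop :=
  | here : forall n W e, name_in_env n W (consEname n W e)
  | there : forall n W m V e,
      name_in_env n W e -> name_in_env n W (consEname m V e).

(* Name-only half of unify_env: shared names get equal types. *)
Definition unify_names (e1 e2 : env) : Prop :=
  forall n W1 W2,
    name_in_env n W1 e1 -> name_in_env n W2 e2 -> W1 = W2.

(* The empty environment imposes no constraint at all. *)
Lemma unify_empty : forall e : env, unify_names emptyE e.
Proof.
  unfold unify_names.
  intros e n W1 W2 H _.
  inversion H.
Qed.
```

Because the constraint is a plain Leibniz equality `W1 = W2`, any hypothesis produced from it can be fed directly to rewriting, which is the point made above about avoiding explicit unifiers.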

The operation of “merging” two (unified) environments is rendered by the 
following predicate: 

Inductive union_env: env -> env -> env -> Prop :=
trivial_union : (e:env) (union_env emptyE e e)
| group_union : (e,e',e'',e''':env) (g:group) (S,S':starGlist)
(C,C',E,E':Glist) (t:comType)
...
(union_env (consEname n W e)
e' (consEname n W e'')).

In order to check whether (union_env e1 e2 e) holds, we proceed by structural induction
on e1. The case where e1 is empty (trivial_union) is straightforward.
When the head constructor of e1 is consEname, we simply “copy” the association
between the name n and the message type W into the merged environment e. The
only interesting case is when the head constructor of e1 is consEgroup, since
we must search e2 for the occurrence of g, adding the components of the
corresponding group type to those of the occurrence in e1 (using the predicates add_S,
add_C and add_E). Then we remove g from e2 (predicate remove_group) and
we continue inductively. The definitions of all the previous auxiliary predicates
can be found at www.dimi.uniud.it/~scagnett/Coq-Sources/m3coq.v.

Finally the closure operation which computes the correct components S of 
group types contained in an environment (eliminating all the occurrences of the 
marker *) is defined as follows: 

Inductive closure: starGlist -> Glist -> env -> Prop :=
elim_star : (l:starGlist) (e:env)
((g:group) (S:starGlist) (C,E:Glist) (t:comType)
(starGlist_isin (star g) l) /\
(group_in_env g (gr_star S C E t) e) ->
(inc_starGlist S l)
) -> (closure l (star_clear l) e)
| add_grp : (l,S:starGlist) (C,E,l':Glist) (t:comType)
(g:group) (e:env) (starGlist_isin (star g) l) ->
(group_in_env g (gr_star S C E t) e) ->
~(inc_starGlist S l) ->
(closure (append_starGlist l S) l' e) ->
(closure l l' e).

where star_clear is the function which erases all the occurrences of the marker
* in the list passed as argument:

Fixpoint star_clear [l:starGlist] : Glist :=
Cases l of
starEmptyG => emptyG
| (starConsG (simple g) l') => (consG g (star_clear l'))
| (starConsG (star g) l') => (consG g (star_clear l'))
end.
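
For readers using a recent Coq release, the same function can be sketched in present-day syntax. The block below is a transcription under the assumption that `group` is an abstract type; the lemma `star_clear_example` is a hypothetical addition of ours, included only to show the function computing.

```coq
(* Assumption: [group] is left abstract, as in the text. *)
Parameter group : Set.

Inductive Glist : Set :=
  | emptyG : Glist
  | consG : group -> Glist -> Glist.

Inductive star_group : Set :=
  | simple : group -> star_group
  | star : group -> star_group.

Inductive starGlist : Set :=
  | starEmptyG : starGlist
  | starConsG : star_group -> starGlist -> starGlist.

(* Erase every * marker, keeping the underlying group names. *)
Fixpoint star_clear (l : starGlist) : Glist :=
  match l with
  | starEmptyG => emptyG
  | starConsG (simple g) l' => consG g (star_clear l')
  | starConsG (star g) l' => consG g (star_clear l')
  end.

(* A starred and a plain name both end up as plain names. *)
Lemma star_clear_example (g h : group) :
  star_clear (starConsG (star g) (starConsG (simple h) starEmptyG))
  = consG g (consG h emptyG).
Proof. reflexivity. Qed.
```

The two non-empty branches are deliberately identical: erasing the marker means forgetting which constructor of star_group wrapped the name.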

The two constructors elim_star and add_grp correspond to the computation
rules specified in point 1 of Definition 8 in [3]. Indeed, in order to compute the
closure of an environment $\Gamma$, one has to replace $\mathcal{S}(G)$ with $\mathcal{S}(G) \cup \mathcal{S}(G')$ for
every $g^* \in \mathcal{S}(G)$ such that $g : G' \in \Gamma$ and $\mathcal{S}(G') \not\subseteq \mathcal{S}(G)$ (constructor add_grp).
Then, when there are no more $g^*$ satisfying the previous condition, one can erase
all the * markers (constructor elim_star).

Now we are ready to introduce the inductive predicates which encode the
type inference rules in Figures 12 and 26 of [3]:

Inductive msg_inf : cap -> msgType -> env -> Prop := ...

Inductive proc_inf : raw_proc -> proc -> group -> env -> Prop := ...

Due to lack of space, we cannot report the complete definitions of these predicates,
which are available at www.dimi.uniud.it/~scagnett/Coq-Sources/m3coq.v.

3 The Formal Development 

In this section we describe the formal development carried out in Coq. The ultimate
result is the certification of the correctness of the type inference algorithm;
however, in order to achieve this goal, there are many subtleties to deal with. In
Section 3.1, we introduce the auxiliary notions and properties we have to supply
in order to prove the main goal, which we then illustrate under “Soundness of the
type inference algorithm”.

3.1 Basic Notions and Properties 

In order to prove the correctness of the type inference algorithm, we need some 
basic properties about environments and the related operations (see Section 2.3). 

The first two must be stated as axioms; they allow us to infer, from the environment
computed by the type inference algorithm, the hypotheses needed in the
current proof context in order to derive the appropriate typing judgments
(recall from Section 2.2 that type_name and type_group are parametric
predicates allowing one to record the current associations between names, groups
and their respective types):

Axiom TYPE_NAME: (n : name) (g: group) (e: env) 

(name_in_env n (amb_type g) e) -> 

(type_name n (amb_type g)). 

Axiom TYPE_GROUP: (g:group) (S:starGlist) (C,E:Glist) (t:comType)

(e:env) (group_in_env g (gr_star S C E t) e) ->

(S':Glist) (closure S S' e) ->

(type_group g (gr S' C E t)).

Then we need some basic properties ensuring that if a given entity (a group name
or a typing association $g : gr(\mathcal{S}, \mathcal{C}, \mathcal{E}, T)$) occurs in an environment, then it also
occurs in the result of a merge with another environment or of its closure:

Lemma UNION_IN: (e1,e2,e:env) (g:group) (S:starGlist) (C,E:Glist)

(t:comType) (unify_env e1 e2) ->

(union_env e1 e2 e) ->

(group_in_env g (gr_star S C E t) e2) ->

(Ex [S':starGlist] (Ex [C':Glist] (Ex [E':Glist]

Lemma GROUP_IN_CLOSURE: (S:starGlist) (S':Glist) (e:env) (g:group)

(starGlist_isin (simple g) S) ->

(closure S S' e) -> (Glist_isin g S').

The previous lemmata depend on some minor results about the decidability
of equality over group lists, group types (possibly with “starred” groups),
message and communication types:

Lemma GLIST_DEC: (l,l':Glist) l=l' \/ ~l=l'.

Lemma STAR_GROUP_DEC: (s,s':star_group) s=s' \/ ~s=s'.

Lemma STAR_GLIST_DEC: (l,l':starGlist) l=l' \/ ~l=l'.

Lemma MSG_TYPE_DEC: (W,W':msgType) W=W' \/ ~W=W'.

Lemma COM_TYPE_DEC: (c,c':comType) c=c' \/ ~c=c'.

Lemma STAR_GROUP_TYPE_DEC: (s,t:starGroupType) s=t \/ ~s=t.

All those results are derivable assuming two axioms of the Theory of Contexts [6] 
about the decidability of equality over names and groups respectively: 

Axiom dec_name: (n,m:name)n=m \/ ~n=m. 

Axiom dec_group: (g,h:group) g=h \/ ~g=h.

These axioms allow us to render in Coq a common assumption about process
algebras (like the Ambient Calculus and its variant $M^3$), namely that we can
always decide whether two names or two groups are equal or not. Indeed, many
proofs in the literature are carried out by cases on equalities over names.
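
As an illustration of how such derived results follow from the axioms, the sketch below proves the decidability of equality over group lists from dec_group alone, in present-day Coq syntax. It is not the proof script of the development (which is not shown here); the tactic choices, in particular the use of `congruence`, are ours.

```coq
Parameter group : Set.

(* The axiom of the Theory of Contexts assumed in the text. *)
Axiom dec_group : forall g h : group, g = h \/ g <> h.

Inductive Glist : Set :=
  | emptyG : Glist
  | consG : group -> Glist -> Glist.

Lemma GLIST_DEC : forall l l' : Glist, l = l' \/ l <> l'.
Proof.
  induction l as [| g l IH]; destruct l' as [| h l'].
  - left; reflexivity.
  - right; discriminate.
  - right; discriminate.
  - (* Compare the heads with the axiom, the tails with IH. *)
    destruct (dec_group g h) as [Hgh | Hgh];
      destruct (IH l') as [Hl | Hl].
    + left; congruence.
    + right; congruence.
    + right; congruence.
    + right; congruence.
Qed.
```

The same pattern (induction on one argument, case analysis on the other, appeal to the axiom at the leaves) would presumably carry over to the other decidability lemmata listed above.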



Soundness of the type inference algorithm. Since the type inference rules 
are split in two sets: the first for capabilities and the second for raw processes, 
we proved two soundness lemmata: 

Lemma MSG_INF_SOUND : (M:cap) (W:msgType) (e:env) 

(msg_inf M W e) -> (good_msg M W) . 

Lemma PROC_INF_SOUND: (R:raw_proc) (P:proc) (g:group) (e:env)

(proc_inf R P g e) -> (good_proc P g).

They are proved by structural induction on M and R, respectively. Obviously,
the former result is needed in order to prove the latter, since processes
are built on top of capabilities. These two lemmata are the formal equivalent
of Theorem 6 of [3]. The complete Coq code is available at the URL
http://www.dimi.uniud.it/~scagnett/Coq-Sources/m3coq.v.

During the proof development of MSG_INF_SOUND we spotted a minor error
in the original definition of rule (I-Out) in [3]; indeed, it is stated as follows:

\[ \vdash_I \mathsf{out}\ x : (g_2 \to g_2;\ \{x : g_1,\ g_2 : gr(\{g_1^*\}, \{g_1\}, \emptyset, t)\}) \]

while it should be

\[ \vdash_I \mathsf{out}\ x : (g_2 \to g_2;\ \{x : g_1,\ g_1 : gr(\emptyset, \emptyset, \emptyset, t),\ g_2 : gr(\{g_1^*\}, \{g_1\}, \emptyset, t)\}) \]

otherwise, the clause $\mathcal{S}(G_1) \subseteq \mathcal{S}(G_2)$ of rule (OUT) in Figure 1 cannot be
satisfied, since no group type will be associated to $g_1$.

4 Conclusions

In this paper we encoded the syntax and the type system of a variant of the 
original Ambient Calculus which replaces the potentially dangerous open prim- 
itive with a new instruction to, moving lightweight processes without enclosing 
them into an ambient. Moreover, we provided a formal representation of the type 
inference rules introduced in [3], proving that they are sound w.r.t. the original 
type system. 

The novelty of the approach used in this paper is the treatment of unifications
by means of schematic judgments involving Leibniz equalities, since this
approach allows us to avoid an explicit implementation of the machinery underlying
the theory of most general unifiers. Indeed, Leibniz equality corresponds
to $\beta\delta\iota$-equality in Coq, and this fact allows us to rewrite the terms involved in the
unifying constraints as needed during the proof development.

Future work. The material reported in the present paper is part of a larger work
in progress about the encoding and formal development of the metatheory of the
Ambient Calculus, both in the original typeless form and in other typed versions
and variants. As a consequence we are still involved in the activity of proof
development, since some minor technical results are still to be proved. However,
we are confident that we will finish the whole formal development soon, since, according
to our experience, the higher-order approach has proved to be very fruitful when
applied to the formal representation of process algebras [7, 10].

An interesting issue to take into consideration in the near future is to formally
address the completeness of the type inference procedure introduced in [3].

Abstract. This paper is part of a research project where we are exploring
methods to extend the computational content of various systems
of typed $\lambda$-calculus by adding new reductions. Our previous study had its
focus on isomorphisms of simple inductive types and related extensions
of term rewriting. In this paper we present some new results concerning
the representation of finite sets as inductive types and related algebraic
structures.



1 Introduction 

It is routine to mention user-defined computation rules when the question of
specifying type theories for proof development and verification is considered. At
the same time, the requirements concerning the general form and properties of
these rules are never, to our knowledge, described precisely. All concrete examples
usually belong to a relatively narrow, well-studied set.

This paper is part of a research project where we are exploring methods to
extend the computational content of various systems of typed $\lambda$-calculus by adding
new reductions. Our project may be regarded as one of the approaches motivated
by the convergence between computer-assisted reasoning and symbolic computation
(cf. [1]). Clearly, it would be more important to develop this approach in
the case of higher-order and dependent type systems, but many interesting aspects
are already present at the level of simply typed calculus with inductive types,
while the technical difficulties are considerably smaller.

Our previous study focused on isomorphisms of simple inductive types and
related extensions of term rewriting. In this paper we present some new results
concerning the representation of finite sets as inductive types and related categorical
and group-theoretical algebraic structures, but the example of isomorphism is
still useful to make some general observations. Consider the following (standard)
definition.

Definition 1. Two types $\rho$ and $\tau$ are isomorphic if there exist $f : \rho \to \tau$ and
$g : \tau \to \rho$ such that $f \circ g \sim id_\tau$, $g \circ f \sim id_\rho$. Then, $g$ is denoted $f^{-1}$ and called
the inverse of $f$, and one writes $f : \tau \cong \rho : f^{-1}$.

To apply it in the case of typed $\lambda$-calculus we need:

— a composition operator $\circ : (\tau \to \upsilon) \times (\rho \to \tau) \to (\rho \to \upsilon)$ (with $\tau$, $\upsilon$, $\rho$ types);

— a term $id_\tau$ representing identity;

— an equivalence relation $\sim$ on terms.

Usually these data are subject to certain assumptions:

- The operator $\circ$ is defined for all types $\tau$, $\upsilon$, $\rho$ and terms $s : \tau \to \upsilon$, $t : \upsilon \to \rho$.
This assumption seems natural if the whole $\lambda$-calculus is turned into a category,
but it is not if we want to consider a categorical (or other algebraic) structure on
some part of it, cf. our Section 4.

- $\circ$ is associative up to $\sim$. This condition is closely connected with transitivity
of the isomorphism relation. Thus it seems less natural to omit it completely, but it
may be restricted to some subset of all types.

- The term $id_\tau$ exists for every $\tau$. The same remark as above applies.

- For any $f : \rho \to \tau$ one has $f \circ id_\rho \sim f$ and $id_\tau \circ f \sim f$. Same remark. In
particular, if some subset of the whole $\lambda$-calculus is considered then $id_\tau$ need
not be $\lambda x^\tau . x$.

As to the relation $\sim$, usually (though not necessarily) $id_\tau = \lambda x^\tau . x$ and
$\sim$ is based on an underlying reduction: for instance, $f \sim g$ means that $f$ and $g$
have the same normal form w.r.t. a certain system of reductions (cf. [2,4]).

In previous works, we studied isomorphisms of inductive types (i.e., recursive
types satisfying a condition of strict positivity) in an extensional simply-typed
$\lambda$-calculus with product and unit types. The $\circ$ and $id_\tau$ were defined in the ordinary
way as $g \circ f = \lambda x^\tau . (g\,(f\,x))$ and $id_\tau = \lambda x^\tau . x$. It was shown that the calculus enjoys
strong normalization and confluence. Note that if $\sim$ means the corresponding
equivalence relation, the provability of the equivalences $\forall x^\rho . g(f(x)) \sim x$ and
$\forall y^\tau . f(g(y)) \sim y$ doesn't imply in general $f \circ g \sim id_\tau$, $g \circ f \sim id_\rho$.

At this point we extended the calculus with new conversion rules ensuring 
that all inductive representations of the product and unit types become isomor- 
phic, and the extended reduction relation remains convergent. Finally we defined 
a notion of faithful copy of an inductive type (called isomorphic copy in [6]) and 
a corresponding reduction relation which also preserves the good properties of 
the calculus. 

In this paper, we study some other kinds of extensions of reduction systems.
It may be regarded as a first step towards a more efficient treatment of representations
of categories and other algebraic structures (groups, G-sets, semi-groups,
monoids etc.) within typed $\lambda$-calculi including inductive types. Our first inspiration
was the consideration of a group action on differential equations of certain
types (in order to carry out a formal development in a proof assistant). The
equations were represented by vectors of parameters (regarded as elements of
some inductive type) and the group itself by operators (symmetries) acting on
coefficients. (See [3].)

In this paper we consider “finite types”, i.e., inductive types of the form
$\omega_n = \mu\alpha(c_1 : \alpha, \ldots, c_n : \alpha)$ and functions between these types. The principal new
reduction is $\overline{f} \circ \overline{g} \to \overline{f \circ g}$ (“lifting” of composition to inductive types). Here
$f : \{1, \ldots, n\} \to \{1, \ldots, m\}$ and $\overline{f} : \mu\alpha(c_1 : \alpha, \ldots, c_n : \alpha) \to \mu\alpha(d_1 : \alpha, \ldots, d_m : \alpha)$,
$\overline{f} = (\!|d_{f(1)}, \ldots, d_{f(n)}|\!)$. The particular case of $f, g : \omega_n \to \omega_n$ corresponds to the
group of permutations $S_n$. We show that the extended calculus is convergent, and
consider other possible reductions and the consequences of this fact in the context
of faithful copies of inductive types.
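
The situation can be observed in a proof assistant with ordinary inductive types. The following Coq sketch is an illustration under our own assumptions, not part of the calculus studied here: the types `w2` and `w3` stand for two finite types, `fbar` and `gbar` are hypothetical functions defined by case analysis, and `fgbar` is the function one would expect the lifted composition to reduce to.

```coq
(* Two finite types, with two and three constructors. *)
Inductive w2 : Set := c1 | c2.
Inductive w3 : Set := d1 | d2 | d3.

(* Lifting of the index map 1 |-> 3, 2 |-> 1. *)
Definition fbar (x : w2) : w3 :=
  match x with c1 => d3 | c2 => d1 end.

(* Lifting of the transposition of the two indices. *)
Definition gbar (x : w2) : w2 :=
  match x with c1 => c2 | c2 => c1 end.

Definition comp {A B C : Set} (h : B -> C) (k : A -> B) : A -> C :=
  fun x => h (k x).

(* The composite computed on indices: 1 |-> 1, 2 |-> 3. *)
Definition fgbar (x : w2) : w3 :=
  match x with c1 => d1 | c2 => d3 end.

(* The two functions agree on every constructor... *)
Lemma lift_comp : forall x : w2, comp fbar gbar x = fgbar x.
Proof. destruct x; reflexivity. Qed.

(* ...and the transposition is its own inverse, pointwise. *)
Lemma swap_involutive : forall x : w2, comp gbar gbar x = x.
Proof. destruct x; reflexivity. Qed.
```

Both lemmas need a case analysis on `x`: with the standard reduction rules the open term `comp fbar gbar` does not reduce to `fgbar`, which is the kind of gap that a reduction lifting composition is meant to close.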

We conclude by a brief outline of future work. The incorporation of reduc- 
tions that will support the functoriality of the inductive type construction w.r.t. 
parameters will open the way to a more efficient treatment (based on term 
rewriting) of group representations in proof assistants. One may notice that iso- 
morphisms correspond already to the representation of inverse elements. Func- 
toriality will be necessary to have associativity of composition. The reductions 
related to the form of presentation of a group based upon generators and rela- 
tions, and the properties of corresponding term rewriting systems, will be the 
task to be studied in a near future. 

2 The Simply-Typed Lambda-Calculus with Inductive Types

In this section, we define $\lambda^{1}\beta\eta\iota$, a simply-typed $\lambda$-calculus with inductive types
and structural-recursion operators over them, taking most of our inspiration
from [7] and [8].

We will consider given infinite sets of constructor names (Const), term variables
(Var) and type variables (TVar), with Const $\cap$ Var = Const $\cap$ TVar =
Var $\cap$ TVar = $\emptyset$. We will reserve the letters $x$, $y$ and $z$ for term variables, $\alpha$
and $\beta$ for type variables, $r$, $s$, $t$, $u$ and $v$ for arbitrary terms, $\rho$ and $\tau$ for arbitrary
types, and $\kappa$ for constructor schemas. The letters $i$, $j$, $k$, $l$ will only be
used for indexes and, respectively, $n$, $m$, $p$, $q$ for their upper bounds. Finally, constructor
names will be denoted either by $c_1, c_2, \ldots, d_1, d_2, \ldots$ or by the generic
name $\mathsf{in}$. Definitions will be introduced by the symbol $\triangleq$, as in $id \triangleq \lambda x^\tau \cdot x$.
Terms and types will be considered up to $\alpha$-congruence (that is, names of bound
variables are meaningless), and this last relation will be denoted $\equiv$, thus one
has $\lambda x^\tau \cdot x \equiv \lambda y^\tau \cdot y$. Sequences of types or terms $(t_i)_{i=1..n}$ will be written
with the usual vectorial notation $\vec{t}^{\,n}$ (or $\vec{t}$ if the index is not important), and
their length will be written $|\vec{t}|$. Using this notation, we will sometimes write
$\vec{\rho}^{\,n} \to \tau$ to mean $\rho_1 \to \cdots \to \rho_n \to \tau$, associated to the right. Furthermore,
$s \in \vec{t}$ will mean that there is an $i$ such that $s \equiv t_i$, and $\vec{t} \subseteq S$ will mean
that all the $t_i$'s belong to the set $S$. We shall sometimes write vectors such as
$\vec{t_{i,j}}$, meaning thus that we have a sequence of sequences, that is terms (or types)
$t_{1,1}, \ldots, t_{1,m_1}, \ldots, t_{n,1}, \ldots, t_{n,m_n}$. Finally, if some indexes depend on other ones,
the former will be themselves indexed by the latter, as in $\vec{t_{j_i}}$ which stands for
$t_{j_1}, \ldots, t_{j_n}$, with $1 \le i \le n$. We will also need a notion of “curried” composition:
for given $\lambda$-terms $f : \vec{\rho} \to \tau$ and $g : \tau \to \upsilon$, $g \circ f$ will be defined as







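$\lambda \vec{z}^{\,\vec{\rho}} \cdot g\,(f\,\vec{z})$, with $\vec{z} \notin FV(g)$ and $\vec{z} \notin FV(f)$. We shall also use the following
notation, provided of course that $g$ and $f$ are of suitable types:

\[
g \bullet f \triangleq
\begin{cases}
g \circ f & \text{if } g \text{ and } f \text{ are composable,} \\
g\,f & \text{otherwise, and } g \text{ can be applied to } f.
\end{cases}
\]

Definition 2 (Prototypes). The grammar of prototypes is defined as follows:

\[ \tau ::= \alpha \mid 1 \mid \tau \times \tau \mid \tau \to \tau \mid \mu\alpha(\vec{c} : \vec{\tau}), \quad \text{with } \alpha \in \text{TVar}. \]

Definition 3 (Types). We define simultaneously:

— the set Ty of types:

\[
\frac{}{1 \in \text{Ty}}
\qquad
\frac{\rho, \tau \in \text{Ty}}{\rho \times \tau \in \text{Ty}}
\qquad
\frac{\rho, \tau \in \text{Ty}}{\rho \to \tau \in \text{Ty}}
\qquad
\frac{\vec{c} \subseteq \text{Const} \quad \alpha \in \text{TVar} \quad \vec{\kappa} \subseteq \text{Sch}(\alpha)}{\mu\alpha(\vec{c} : \vec{\kappa}) \in \text{Ty}}
\]

— and the set Sch($\alpha$) of constructor schemas over type variable $\alpha$:

\[
\frac{\vec{\rho}, \vec{\sigma_1}, \ldots, \vec{\sigma_n} \subseteq \text{Ty}}{\vec{\rho} \to (\vec{\sigma_1} \to \alpha) \to \cdots \to (\vec{\sigma_n} \to \alpha) \to \alpha \in \text{Sch}(\alpha)}
\]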

As usual, constructor names can only belong to one inductive type. Thus, an 
inductive type is also defined by the names of its constructors. 

For the sake of readability, all constructors and constructor types will be
indexed by $1 \le k \le p$, and all operators by $0 \le i \le n$ and $0 \le j \le m_i$. Remarks:

— An inductive type is a recursive type built from a sequence of (constructor)
schemas.

— Every schema over $\alpha$ is of the form $\vec{\rho} \to (\vec{\sigma_1} \to \alpha) \to \cdots \to (\vec{\sigma_n} \to \alpha) \to \alpha$,
and each premise is called an operator over $\alpha$. The number of operators
in a schema (/constructor type/constructor) is denoted $ar(\kappa_k)$ (arity).
We write $nb_P(\kappa_k)$ for the number of $\rho$'s and $nb_R(\kappa_k)$ for the number of operators
$\vec{\sigma_i} \to \alpha$, thus we have $ar(\kappa_k) = nb_P(\kappa_k) + nb_R(\kappa_k)$ (since $1 \le i \le n$
we have $nb_R(\kappa_k) = n$).

• The $\rho$'s are in Ty, which implies they don't contain any free type variable.
They are called parameter types. By definition of schemas, parameter
types can only occur at the beginning of a schema: this restriction is
useful for technical reasons, most notably for the typing of recursors and
the definition of their computation rules. It will be clear to the reader
that this is a minor restriction which does not impair the system at all.

• In every operator of the form $\vec{\sigma_i} \to \alpha$, the $\sigma_{i,j}$ are in Ty (for all $i$ and $j$),
which forces the $\alpha$'s to occur only strictly positively in the schema. This
sort of operator is recursive: more precisely, it is a 0-recursive operator
if $|\vec{\sigma_i}| = 0$, and a 1-recursive operator otherwise (by analogy with
functionals of types 0 and 1 in Gödel's system T).

• In order to enhance readability, we will often denote inductive types
$\mu\alpha\,\vec{\kappa}$ by $\mu$.

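Definition 4 (Terms). The set of terms is generated by the following grammar
(with $x \in$ Var, $k \in \mathbb{N} \setminus \{0\}$ and $\tau, \mu \in$ Ty):

\[ t ::= x \mid \star \mid \langle t, t \rangle^{\rho \times \tau} \mid p_1^{\rho \times \tau} \mid p_2^{\rho \times \tau} \mid \lambda x^\tau \cdot t \mid (t\ t) \mid \mathsf{in}_k^\mu \mid (\!|\vec{t}|\!)^{\mu,\tau} \]

Here $\mathsf{in}_k^\mu$ is the $k$'th constructor of the inductive type $\mu$ (in practice, we actually
have constructor names $c \in$ Const), and $(\!|\vec{t}|\!)^{\mu,\tau}$ is a recursor (or structural-recursion
operator) from $\mu$ to another type $\tau$.

Definition 5 (Step Type). Given inductive type(s) $\mu = \mu\alpha\,\vec{\kappa}$ and a result
type $\tau$, we define for every $\kappa_k = \vec{\rho} \to (\vec{\sigma_1} \to \alpha) \to \cdots \to (\vec{\sigma_n} \to \alpha) \to \alpha$ in
Sch($\alpha$) the step type

\[ \delta_k^{\mu,\tau} \triangleq \vec{\rho} \to (\vec{\sigma_1} \to \mu) \to \cdots \to (\vec{\sigma_n} \to \mu) \to (\vec{\sigma_1} \to \tau) \to \cdots \to (\vec{\sigma_n} \to \tau) \to \tau \]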



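Definition 6 (Typing). We define by induction the typing rules of the calculus:

\[
\text{(Var)}\ \frac{}{\Gamma, x : \tau \vdash x : \tau}
\qquad
\text{(Nop)}\ \frac{}{\Gamma \vdash \star : 1}
\qquad
\text{(Pair)}\ \frac{\Gamma \vdash t : \rho \quad \Gamma \vdash u : \tau}{\Gamma \vdash \langle t, u \rangle^{\rho \times \tau} : \rho \times \tau}
\]

\[
\text{(Fst)}\ \frac{\Gamma \vdash t : \rho \times \tau}{\Gamma \vdash p_1^{\rho \times \tau}\, t : \rho}
\qquad
\text{(Snd)}\ \frac{\Gamma \vdash t : \rho \times \tau}{\Gamma \vdash p_2^{\rho \times \tau}\, t : \tau}
\]

\[
\text{(Lambda)}\ \frac{\Gamma, x : \rho \vdash t : \tau}{\Gamma \vdash \lambda x^\rho \cdot t : \rho \to \tau}
\qquad
\text{(App)}\ \frac{\Gamma \vdash t : \rho \to \tau \quad \Gamma \vdash u : \rho}{\Gamma \vdash (t\ u) : \tau}
\]

\[
\text{(In)}\ \frac{c \in \text{Const}}{\Gamma \vdash c_k : \kappa_k[\mu]}
\qquad
\text{(Rec)}\ \frac{\Gamma \vdash t_k : \delta_k^{\mu,\tau}}{\Gamma \vdash (\!|\vec{t}|\!)^{\mu,\tau} : \mu \to \tau}
\]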






Reduction. We take most of our terminology and notation from [9]. Given a
binary relation $R$ on a set $A$, we will denote the induced rewrite relation $\to_R$,
but shall be a bit loose and will often write $R$ for $\to_R$ and vice-versa. We will
respectively write $\to_R^+$, $\to_R^*$, and $=_R$ for its transitive, reflexive-transitive, and
reflexive-symmetric-transitive closures. We say that a term $t$ rewrites to $u$ if
$t \to_R u$, and that it reduces to $u$ if there is a derivation
$t \to_R^* u$. The union $R \cup S$ of binary relations on a same set will be denoted $RS$.
We also write $R;S$ for the set $\{(r, s) \mid \exists t \cdot r\,R\,t \wedge t\,S\,s\}$. A term is in normal form
if it is not rewritable. A rewrite relation $R$ is strongly normalising (terminating)
if there is no infinite derivation $t_1 \to_R t_2 \to_R \ldots$, for any term $t_1$.

Given two rewrite relations $R$ and $S$: $R$ commutes with $S$ if
${}^*_S{\leftarrow} ; \to_R^* \ \subseteq\ \to_R^* ; {}^*_S{\leftarrow}$; $R$ commutes locally with $S$ if
${}_S{\leftarrow} ; \to_R \ \subseteq\ \to_R^* ; {}^*_S{\leftarrow}$; $R$ commutes
strictly locally over $S$ if ${}_S{\leftarrow} ; \to_R \ \subseteq\ \to_R^+ ; {}^*_S{\leftarrow}$.

This definition is made in [10], and by R. Di Cosmo in [11] to state Akama-Di
Cosmo's Lemma under the name of (DPG) condition (see Lemma 1 below).

A relation $R$ is confluent (resp. locally confluent) if it commutes (resp. commutes
locally) with itself. A strongly normalising and confluent relation is said
to be convergent. We will also write $R/S$ to represent the quotient of a relation $R$ by
the reflexive-symmetric-transitive closure of $S$.

The usual notion of substitution will be written $t\{u/x\}$, to mean that $u$ replaces
every free occurrence of $x$ in $t$, avoiding capture. Finally, as usual in this
kind of work, we will consider contexts (written $C[\,]$), that is terms with a “hole”
inside them which can be filled (giving for example $C[(\lambda x^\tau \cdot p)\ q]$).

Definition 7 ($\beta$-conversion). We define the relation of $\beta$-conversion by the
following rules: $(\beta_\to)$ $(\lambda x^\tau \cdot t)\ u \to_{\beta_\to} t\{u/x\}$; $(\beta_{\times_1})$ $p_1^{\rho \times \tau} \langle t, u \rangle^{\rho \times \tau} \to_{\beta_{\times_1}} t$;
$(\beta_{\times_2})$ $p_2^{\rho \times \tau} \langle t, u \rangle^{\rho \times \tau} \to_{\beta_{\times_2}} u$, and we write $\beta_{\times_{1,2}}$ for $\beta_{\times_1} \cup \beta_{\times_2}$, and $\beta$ for
$\beta_\to \cup \beta_{\times_{1,2}}$.

Definition 8 ($\eta$-conversion). We define the relation of $\eta$-conversion by the
following set of rules: $(\eta_\to)$ $t \to_{\eta_\to} \lambda x^\tau \cdot t\ x$ if $t : \tau \to \upsilon$, $t$ is not in applicative
position, $x \notin FV(t)$; $(\eta_\times)$ $t \to_{\eta_\times} \langle p_1^{\tau \times \rho}\, t, p_2^{\tau \times \rho}\, t \rangle^{\tau \times \rho}$ if $t : \tau \times \rho$, $t$ is not a pair, $t$
is not projected; $(\eta_1)$ $t \to_{\eta_1} \star$ if $t : 1$, $t \not\equiv \star$. We'll write $\eta_{\to,\times}$ for $\eta_\to \cup \eta_\times$, and
$\eta$ for $\eta_{\to,\times} \cup \eta_1$.


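Definition 9 ($\iota$-conversion). Let $\mu = \mu\alpha\,\vec{\kappa}$, and $\kappa_k = \vec{\rho} \to (\vec{\sigma_1} \to \alpha) \to
\cdots \to (\vec{\sigma_n} \to \alpha) \to \alpha$ over $\alpha$ in $\mu$. Given a term $\mathsf{in}_k^\mu\,\vec{a}$, we write $\vec{u}$ for the
$n = nb_R(\kappa_k)$ recursive arguments it contains. The reader will recall that we have
$u_i : \vec{\sigma_i} \to \mu$ for any $1 \le i \le n$. Then, we define $\iota$-conversion by the rule

\[ (\!|\vec{t}|\!)^{\mu,\tau}\,(\mathsf{in}_k^\mu\,\vec{a}) \to_\iota t_k\ \vec{a}\ ((\!|\vec{t}|\!)^{\mu,\tau} \bullet \vec{u}). \]

Remark 1. Recall that $g \bullet f$ is just an abbreviation. Hence, we could describe
$\iota$-reduction as $(\!|\vec{t}|\!)^{\mu,\tau}\,(\mathsf{in}_k^\mu\,\vec{a}) \to_\iota t_k\ \vec{a}\ \Delta(\vec{u})$, where

\[
\Delta(u_i) =
\begin{cases}
(\!|\vec{t}|\!)^{\mu,\tau}\ u_i & \text{if } u_i : \mu \text{ (i.e., } u_i \text{ is 0-recursive)} \\
(\!|\vec{t}|\!)^{\mu,\tau} \circ u_i & \text{if } u_i : \vec{\sigma_i} \to \mu \text{ (i.e., } u_i \text{ is 1-recursive).}
\end{cases}
\]

Often we will write just $(\!|\vec{t}|\!)\,(\mathsf{in}_k\,\vec{a}) \to_\iota t_k\ \vec{a}\ ((\!|\vec{t}|\!) \bullet \vec{u})$.

The $\lambda$-calculus thus defined, together with $\beta\eta\iota$-conversion, is called $\lambda^{1}\beta\eta\iota$.

In the rest of the paper, we will often omit type indications, except for abstracted
variables, to lighten the notation.

3 Earlier Results

Detailed proofs of our results presented in this section can be found in [5].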

Convergence of /3r]t - conversion The following lemma due to Y. Akama (fur- 
ther simplified by R. Di Cosmo) is of great use when one wishes to consider 
adding expansional rules to a rewrite system. 

Lemma 1 (Akama-Di Cosmo’s Lemma). Let R and S be two convergent
relations, such that R preserves S-normal forms. Then RS is convergent if R
commutes strictly locally over S ([12], [11]).

The simulation technique devised by [13] (which consists in finding a translation
between rewriting systems that lifts certain properties) is also very useful.



Theorem 1. $\beta\iota$-conversion is convergent.

This is a well-known result: strong normalisation is generally proved (often
on stronger calculi, such as extensions of Girard’s system F) using (variants of)
reducibility candidates, see [14,7,15,16]. Confluence follows from the fact that
$\beta\iota$-conversion is an orthogonal higher-order system, see for example chapter 8
of [15] and [17]. Using the simulation technique and Theorem 1, one obtains

Theorem 2. fj l- conversion is convergent. 

With the help of the following proposition, several useful results may be obtained.

Proposition 1. (cf. Prop. 21 from [18]) Let R be a strongly normalising left-linear
rewrite relation, generated by rules that do not contain the term $\star$ in their
left-hand side. Then $R \cup \eta_1$ is strongly normalising.



Lemma 2. $\beta\eta_1\iota$-conversion is strongly normalising.



Lemma 3. $\beta\eta_1\iota$-conversion is confluent.



Theorem 3. $\beta\eta\iota$-conversion is convergent.

Our term system is slightly different from Di Cosmo’s and some effort is
necessary to reuse his proof in [11]. Our proof in [5] uses Lemmas 2 and 3,
Akama-Di Cosmo’s Lemma and case analysis (to show that $\beta\eta_1\iota$-conversion commutes
strictly locally over $\eta_{\to\times}$-conversion).

Inductive representation of product and unit types. An interesting
question is to know whether we may have $\rho \times \tau \cong \Pi(\rho, \tau)$ (for any types $\rho$ and $\tau$)
and $1 \cong U$. Unfortunately, it is not immediately the case: for the product, we
would have functions $f : (\rho \times \tau) \to \Pi(\rho, \tau)$ and $g : \Pi(\rho, \tau) \to (\rho \times \tau)$ defined as

$f = \lambda p^{\rho\times\tau} \cdot \mathrm{pair}^{\rho,\tau}\, (p_1\, p)\, (p_2\, p)$ and $g = (\!|\lambda a^{\rho} \lambda b^{\tau} \cdot \langle a,b\rangle|\!)^{\Pi(\rho,\tau),\,\rho\times\tau}$,
and then $g \circ f =_{\beta\eta\iota} \mathrm{id}_{\rho\times\tau}$ but

$f \circ g = \lambda z^{\Pi(\rho,\tau)} \cdot f\, (g\, z) =_{\beta\eta\iota} \lambda z^{\Pi(\rho,\tau)} \cdot \mathrm{pair}^{\rho,\tau}\, (p_1\, (g\, z))\, (p_2\, (g\, z)) \neq_{\beta\eta\iota} \mathrm{id}_{\Pi(\rho,\tau)}$

because the calculus lacks an analog of $\eta_{\times}$ for $\Pi(\rho, \tau)$. The same situation arises
for 1 and U.

One solution is to add such an $\eta$-like rule for any inductive representation of the
product or of the unit type. Recall now that our calculus features the possibility
(though unused in the paper until now) to name the constructors of an inductive
type. As a consequence, we should add such a rule for all representations of the
product and the unit types (all these representations shall become isomorphic).

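Definition 10. For any types $\Pi(\rho, \tau) = \mu\alpha\,(\mathrm{pair}^{\rho,\tau} : \rho \to \tau \to \alpha)$ and
$U = \mu\alpha\,(\mathrm{nop} : \alpha)$, we define the following conversion rules:

$(\nu_{\times})$ $t \to_{\nu_{\times}} \mathrm{pair}^{\rho,\tau}\, (\mathrm{proj}_1^{\rho\times\tau}\, t)\, (\mathrm{proj}_2^{\rho\times\tau}\, t)$ if $t : \Pi(\rho, \tau)$, $t \neq \mathrm{pair}^{\rho,\tau}\, s_1\, s_2$, and $t$ is
not projected by some $\mathrm{proj}_i^{\rho\times\tau}$;

$(\nu_1)$ $t \to_{\nu_1} \mathrm{nop}$ if $t : U$, $t \neq \mathrm{nop}$,

where $\mathrm{proj}_i^{\rho_1\times\rho_2} \triangleq (\!|\lambda x_1^{\rho_1} \lambda x_2^{\rho_2} \cdot x_i|\!)^{\Pi(\rho_1,\rho_2),\,\rho_i}$; and we write $\nu$ for $\nu_{\times} \cup \nu_1$.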



We consider the properties of strong normalisation and confluence for the
calculus with $\beta\eta\iota\nu$-conversion, which we will call $\lambda^1\beta\eta\iota\nu$.

To prove convergence for this set of conversions, we roughly follow the same
procedure used to prove this property for $\lambda^1\beta\eta\iota$. This is quite natural, as $\nu_{\times}$- and
$\nu_1$-conversions are very similar to $\eta_{\times}$- and $\eta_1$-conversions.

Lemma 4. $\beta\eta_1\iota\nu_1$-conversion is strongly normalising.



Lemma 5. $\beta\eta_1\iota\nu_1$-conversion is confluent.



Theorem 4. $\beta\eta\iota\nu$-conversion is convergent.

As we wrote earlier, in an “actual” setting, many different instances (with
different names of types and constructors) of inductive representations of the
product and unit types may occur. In this case, as many forms of $\nu$-conversions
must be added; let us call them $\nu^2$, $\nu^3$, etc.

We prove, by a simple inductive argument, that adding these conversions
to $\beta\eta\iota$-conversion will keep the convergence property.

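Recall that, in $\lambda^1\beta\eta$, any isomorphism is obtained by finite composition of
the following base of 7 isomorphisms ([4]):

$\rho \times \tau \cong \tau \times \rho$
$\rho \times (\tau \times \upsilon) \cong (\rho \times \tau) \times \upsilon$
$(\rho \times \tau) \to \upsilon \cong \rho \to (\tau \to \upsilon)$
$\rho \to (\tau \times \upsilon) \cong (\rho \to \tau) \times (\rho \to \upsilon)$
$\rho \times 1 \cong \rho$
$\rho \to 1 \cong 1$
$1 \to \rho \cong \rho$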
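
As an informal illustration outside the calculus, the third isomorphism of this base, $(\rho \times \tau) \to \upsilon \cong \rho \to (\tau \to \upsilon)$, can be mimicked in Python by a pair of mutually inverse conversions. The names `curry`, `uncurry`, `add_pair` and `add_curried` are chosen here for illustration only, and the check is merely extensional on a few sample inputs.

```python
def curry(h):
    """Turn h : (rho x tau) -> upsilon into rho -> (tau -> upsilon)."""
    return lambda a: lambda b: h((a, b))


def uncurry(k):
    """Turn k : rho -> (tau -> upsilon) into (rho x tau) -> upsilon."""
    return lambda pair: k(pair[0])(pair[1])


def add_pair(pair):
    # Sample inhabitant of (int x int) -> int.
    return pair[0] + pair[1]


def add_curried(a):
    # Sample inhabitant of int -> (int -> int).
    return lambda b: a + b


samples = [(0, 0), (1, 2), (5, 7)]

# Both round trips agree with the original function on every sample.
round_trip_1 = all(uncurry(curry(add_pair))(p) == add_pair(p) for p in samples)
round_trip_2 = all(
    curry(uncurry(add_curried))(a)(b) == add_curried(a)(b) for a, b in samples
)
print(round_trip_1, round_trip_2)
```

In the calculus itself the two round trips are conversions between terms, not tests on sample values; the sketch only conveys the shape of the two maps.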

As $\lambda^1\beta\eta\iota$ and $\lambda^1\beta\eta\iota\nu$ contain $\lambda^1\beta\eta$, the 7 isomorphisms still hold, though
these are, of course, no longer the only ones. However, note that this is
an interesting result, as isomorphic types may occur in any inductive type (see
the next section).

Furthermore, every inductive type similar to $\Pi(\rho, \tau)$ or $U$ (up to constructor
names) is now isomorphic to $\rho \times \tau$ or 1, respectively, and thus
inherits all CCC-isomorphisms.

Copies. The CCC-isomorphisms holding on inductive representations of the
product and unit types are rather “structural”. One may then wonder whether
less structural isomorphisms (corresponding for example to set-theoretic
injections) may hold computationally between inductive types.

Unfortunately, many isomorphisms of inductive types can’t be “internalised”
through a conversion relation. For instance, one may wish to have $\tau \cong \mu\alpha\,(\tau \to \alpha)$
for any type $\tau$. This doesn’t seem to be easily feasible: various new conversion
rules that seem appropriate, combined with ^(.-conversion yield a non-confluent 
rewrite system. 

Similarly, adding computational rules to have 77( N, N) = N or A7(N, N) = N 
creates serious problems. Therefore, the best we can do is add specific rules giving 
us as many isomorphisms as possible. Here comes into play the notion of faithful 
copy 1 

Definition 11. Let there be given types $\pi$ and $\pi'$, with $f : \pi \to \pi'$, $f' : \pi' \to \pi$,
and inductive types $\mu = \mu\alpha\,(\bar c : \bar\kappa)$ and $\mu' = \mu\alpha\,(\bar c' : \bar\kappa')$. Then $\mu'$ is called a
copy of $\mu$ if it differs from $\mu$ only by the names of its constructors, and by the
fact that zero or more occurrences of $\pi$ in $\mu$ are replaced by $\pi'$ in $\mu'$, where $\pi$
and $\pi'$ may only appear as parameters or no deeper than the whole domain of
1-recursive operators.

In [5] we consider only the case when $f : \pi \cong \pi' : f'$, i.e., $f$, $f'$ are mutually
inverse intensional isomorphisms. The copy is then dubbed faithful.

Remark 2. We don’t consider here the possible task of reordering constructors
so that $c_i$ matches $c'_i$ for all $1 \le i \le n$.

For ease of reading, we will fix the symbols $\pi$, $\pi'$, $\mu$, $\mu'$, $\bar c$, $\bar c'$, $f$
and $f'$ appearing in the previous definition, and will consistently use them till
the end of this section.

If $f : \pi \cong \pi' : f'$, faithful copy is just a special case of isomorphism, which
is provable but not computable. In [5] it was shown that it is possible to add a
new corresponding $\chi$-conversion rule making the isomorphism intensional, and
such that the underlying conversion relation of the resulting calculus $\lambda^1\beta\eta\iota\nu\chi$
remains convergent.

To describe this notion of choice, we will write $(f\ ?\ x)$ to denote $(f\, x)$ if $x : \pi$
and $x$ corresponds to an occurrence to be replaced, and just $x$ otherwise.
Similarly, $g \circ_? f'$ will mean $g \circ f'$ if the domain of $g$ is to be replaced, and just $g$
otherwise.

1 We first introduced this notion in [6] under the name of isomorphic copy. 







Let us now define the function $fc : \mu \to \mu'$ ($fc' : \mu' \to \mu$ can obviously
be defined conversely). The reader will notice that this procedure can be
automatised. Informally, we have $fc\, (c_k\, \bar p\, \bar r) = c'_k\, \bar p'\, \bar r'$ where $p'_s = f\ ?\ p_s$
and $r'_i = (fc \cdot r_i) \circ_? f'$. (We make a difference between parameters $p_s : \rho_s$,
for $1 \le s \le nb_P(\kappa_k)$, and recursive arguments $r_i : \bar\sigma_i \to \mu$, with $1 \le i \le n$.)

Example 1. Take $f : N \cong P : f'$ (which type $P$ is does not matter) with
$\mu = \mu\alpha\,(c_1 : \alpha,\ c_2 : (N \to N) \to \alpha,\ c_3 : N \to \alpha \to \alpha,\ c_4 : (N \to \alpha) \to \alpha)$ and
$\mu' = \mu\alpha\,(c'_1 : \alpha,\ c'_2 : (N \to N) \to \alpha,\ c'_3 : P \to \alpha \to \alpha,\ c'_4 : (P \to \alpha) \to \alpha)$.

Then the definition of $fc : \mu \to \mu'$ is:

$fc\, c_1 = c'_1$
$fc\, (c_2\, k) = c'_2\, k$
$fc\, (c_3\, h\, t) = c'_3\, (f\, h)\, (fc\, t)$
$fc\, (c_4\, k) = c'_4\, (fc \circ k \circ f')$.
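
The four equations of Example 1 can be played out in a small Python sketch. Everything about the encoding is an assumption made for illustration: N is a Python `int`, its copy P is the tagged tuple `("P", n)`, terms of the two inductive types are tuples headed by a constructor name, and the names `f_inv` and `fc_inv` stand for $f'$ and $fc'$.

```python
def f(n):
    """f : N -> P (hypothetical encoding of the copy P)."""
    return ("P", n)


def f_inv(p):
    """f' : P -> N, the inverse of f."""
    return p[1]


def fc(t):
    """Copy map fc : mu -> mu', one clause per constructor of Example 1."""
    tag = t[0]
    if tag == "c1":
        return ("c1'",)
    if tag == "c2":  # the N -> N argument is not replaced
        return ("c2'", t[1])
    if tag == "c3":  # parameter goes through f, recursive argument through fc
        return ("c3'", f(t[1]), fc(t[2]))
    if tag == "c4":  # 1-recursive argument k becomes fc o k o f'
        k = t[1]
        return ("c4'", lambda p: fc(k(f_inv(p))))
    raise ValueError(f"unknown constructor: {tag}")


def fc_inv(t):
    """Converse map fc' : mu' -> mu, defined symmetrically."""
    tag = t[0]
    if tag == "c1'":
        return ("c1",)
    if tag == "c2'":
        return ("c2", t[1])
    if tag == "c3'":
        return ("c3", f_inv(t[1]), fc_inv(t[2]))
    if tag == "c4'":
        k = t[1]
        return ("c4", lambda n: fc_inv(k(f(n))))
    raise ValueError(f"unknown constructor: {tag}")


term = ("c3", 2, ("c3", 1, ("c1",)))
copied = fc(term)
branching = fc(("c4", lambda n: ("c3", n, ("c1",))))
print(copied)
print(fc_inv(copied) == term)
```

On first-order terms such as `term` the round trip `fc_inv(fc(term))` gives back the same tuple; for `c4` the result contains a Python closure, so agreement can only be observed pointwise, by applying it to elements of P.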

Definition 12. (Formal definition of $fc : \mu \to \mu'$.) We have $fc = (\!|\bar t|\!)^{\mu,\mu'}$ with $t_k$
being the normal form of $\lambda \bar p\, \bar r \cdot c'_k\, (f\ ?\ \bar p)\, (\bar r \circ_? f')$.

Proposition 2. If $f : \pi \cong \pi' : f'$ then $fc : \mu \cong \mu' : fc'$ is a provable
isomorphism.

The principal result (in the case when $f : \pi \cong \pi' : f'$) is that the calculus
extended by the two conversion rules $fc'\, (fc\, x) \to_{\chi} x$ and $fc\, (fc'\, x) \to_{\chi} x$ (we
shall call them $\chi$-conversion rules) is convergent.

Observe that $\chi$-conversion is convergent. We show that $\beta\eta\iota\nu\chi$-conversion is
strongly normalising, and then prove that it is confluent using Newman’s Lemma.
(It doesn’t seem possible to make use of Akama-Di Cosmo’s Lemma.)

In order to show strong normalisation, we used the following property which 
we already used in [6] under the name “deferment” (the better name “adjourn- 
ment” is due to D. Kesner). 

Definition 13 (Adjournment). Given two binary relations R and S, S is
adjournable w.r.t. R if $S ; R \subseteq R ; (RS)^{*}$.

Lemma 6 (Adjournment Lemma). Given two strongly normalising relations 
R and S, RS is strongly normalising if S is adjournable w.r.t R. 
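
On finite relations the adjournment condition can be checked mechanically. The sketch below is an illustration under two assumptions: relations are finite sets of pairs over an explicit carrier, and RS is read as the union of R and S. The function names and the two sample relations are invented for the example.

```python
def compose(a, b):
    """Relational composition a ; b (first a, then b)."""
    return {(x, z) for (x, y1) in a for (y2, z) in b if y1 == y2}


def reflexive_transitive_closure(rel, universe):
    """Least reflexive and transitive relation on `universe` containing `rel`."""
    closure = {(x, x) for x in universe} | set(rel)
    while True:
        extra = compose(closure, closure) - closure
        if not extra:
            return closure
        closure |= extra


def is_adjournable(s, r, universe):
    """Check S ; R <= R ; (R u S)*, assuming RS denotes the union of R and S."""
    lhs = compose(s, r)
    rhs = compose(r, reflexive_transitive_closure(r | s, universe))
    return lhs <= rhs


universe = {"a", "b", "c", "d"}
s = {("a", "b")}
# An S-step followed by an R-step (a -> b -> c) can be redone starting with R (a -> d -> c).
r_good = {("b", "c"), ("a", "d"), ("d", "c")}
# Here no R-step is available from a, so the S-step cannot be adjourned.
r_bad = {("b", "c")}

print(is_adjournable(s, r_good, universe))
print(is_adjournable(s, r_bad, universe))
```

For the rewrite relations of the paper the carrier is infinite, so the condition has to be established by case analysis on redexes rather than by enumeration.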

Proofs of (variations of) this lemma can be found in [19,10,20,6]. 

A subtle point of the proof of SN is that there are cases when the adjournment
lemma can be used only on condition that certain 1-recursive arguments
of the $\iota$-redex are $\eta_{\to}$-expanded. The idea is therefore to “insert” suitable
expansions in a term before $\chi$-conversion, so that the adjournment remains
possible.

Lemma 7. Let R be some rewrite relation such that $\bar\eta_{\to}$ ($\eta$-reduction, opposite to
$\eta_{\to}$) can be postponed w.r.t. every reduction in R (except $\eta_{\to}$ if $\eta_{\to} \in R$). Suppose
there is an infinite derivation $B \to^{*}_{R} C[t] \to_{R} \ldots$ where the occurrence of $t$
may be $\eta_{\to}$-expanded. Then, there exists another derivation $B \to^{*}_{R} C[t] \to_{\eta_{\to}}
C[t'] \to^{*}_{R} \ldots$ which is also infinite.
We can now state a new lemma, based upon a “conditional” adjournment.

Definition 14 (Adjournment Modulo $\eta_{\to}$). Given two binary relations R
and S, with R satisfying the assumption of the previous lemma and $\eta_{\to} \subseteq R$, the
relation S is adjournable w.r.t. R modulo $\eta_{\to}$ if $(S ; R)/\eta_{\to} \subseteq (R ; (RS)^{*})/\eta_{\to}$.

Lemma 8. Given two strongly normalising relations R and S, with $\eta_{\to} \subseteq R$, RS
is strongly normalising if S is adjournable w.r.t. R modulo $\eta_{\to}$.



Lemma 9. $\beta\eta\iota\nu\chi$-conversion is strongly normalising.

In its proof we verify that the previous lemmas can be applied.

Lemma 10. $\beta\eta\iota\nu\chi$-conversion is confluent.

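As $\beta\eta\iota\nu\chi$-conversion is strongly normalising, it is enough to show that
$\beta\eta\iota\nu\chi$-conversion is locally confluent. As $\beta\eta\iota\nu$- and $\chi$-conversions are both confluent,
we are left with showing that $\leftarrow_{\chi} ; \to_{\beta\eta\iota\nu}\ \subseteq\ \to^{*}_{\beta\eta\iota\nu\chi} ; \leftarrow^{*}_{\beta\eta\iota\nu\chi}$. This is done
by careful case analysis.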

Until now, we only considered adding one couple of $\chi$-rules, which entail the
existence of an intensional isomorphism between an inductive type and its
faithful copy. This result can be extended to several other $\chi$-rules (call them $\chi^2$, $\chi^3$,
etc.), in order to get a whole “architecture” of isomorphic inductive types.

It is easy to see that adding these conversions to (3r]Lu u -conversion will keep 

the strong normalisation and local confluence properties. Indeed, writing y u 
for U?i X*, we have the following: 

m n 

Theorem 5. x u -conversion is convergent. 

4 Algebraic Constructions Using Finite Inductive Types 

Category of finite types. Let us consider the type $\omega_n = \mu\alpha\,(c_1 : \alpha, \ldots, c_n : \alpha)$.
Constructors $c_i$ have the type $\omega_n$. Recursors have the form $(\!|t_1, \ldots, t_n|\!)^{\omega_n,\tau}$ where
$t_1 : \tau, \ldots, t_n : \tau$.

Obviously the type $\omega_n$ as a representation of a set of n elements is not unique.
For example, the names of constructors may be changed. To avoid confusion we
use different constructor names in different types. We shall not consider here
more complex cases, such as finite types as part of an inductive family.

Let $[n]$ denote the set of natural numbers $\{1, \ldots, n\}$. Let some types
$\omega_n = \mu\alpha\,(c_1 : \alpha, \ldots, c_n : \alpha)$ and $\omega_m = \mu\alpha\,(c'_1 : \alpha, \ldots, c'_m : \alpha)$ be fixed. To every function
$f : [n] \to [m]$ corresponds the term $(\!|c'_{f(1)}, \ldots, c'_{f(n)}|\!)^{\omega_n,\omega_m}$. Let us denote it
by $\bar f$. Note that in this approach $f$ itself remains outside the formal system; it
is just some function defined on the set $[n]$.

Let $f : [n] \to [m]$, $g : [m] \to [l]$ and the types $\omega_n, \omega_m, \omega_l$ be fixed. If only
standard reductions are considered then $\overline{g \circ f}\, t \neq \bar g(\bar f\, t)$. Indeed, the terms
$(\!|c''_{g(1)}, \ldots, c''_{g(m)}|\!)((\!|c'_{f(1)}, \ldots, c'_{f(n)}|\!)\, x)$ and $(\!|c''_{g(f(1))}, \ldots, c''_{g(f(n))}|\!)\, x$ are already
normal.

The first question that arises is whether the calculus extended by the new
reductions $\bar g(\bar f\, t) \to_{\theta} \overline{g \circ f}\, t$ is convergent. The following lemma is obvious.

Lemma 11. $\theta$-conversion is convergent.

Let us take $\beta\eta\iota\nu$-conversion as R and $\theta$-conversion as S.

Lemma 12. $\beta\eta\iota\nu\theta$-conversion is strongly normalising.

Proof. It is easily verified that $\theta$-reduction is adjournable. E.g., if $\theta$-reduction is
followed by $\iota$-reduction, $C = C[\bar g(\bar f\, c_i)] \to_{\theta} C[\overline{g \circ f}\, c_i] \to_{\iota} C[c''_{g(f(i))}]$, then it may
be replaced by 2 $\iota$-reductions. In other cases $\theta$-reduction can just be postponed.
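
The key case of this proof can be replayed on a toy model. The sketch below rests on an encoding that is an assumption of this example, not the paper's formalism: the term $\bar f$ is modelled by the tuple of constructor indices $(f(1), \ldots, f(n))$, a constructor $c_i$ by its index $i$, an $\iota$-step by a table lookup, and a $\theta$-step by fusing two tables. The functions `bar`, `iota` and `theta` are names invented here.

```python
def bar(f, n):
    """Table (f(1), ..., f(n)) standing for the term (|c'_f(1), ..., c'_f(n)|)."""
    return tuple(f(i) for i in range(1, n + 1))


def iota(table, i):
    """One iota-step: the table applied to constructor c_i gives constructor table[i-1]."""
    return table[i - 1]


def theta(g_table, f_table):
    """One theta-step: fuse the two tables of g and f into the single table of g o f."""
    return tuple(g_table[j - 1] for j in f_table)


n, m = 3, 4
f = lambda i: i + 1          # sample f : [3] -> [4]
g = lambda j: (j % 2) + 1    # sample g : [4] -> [2]

f_bar, g_bar = bar(f, n), bar(g, m)
fused = theta(g_bar, f_bar)

# theta followed by one iota-step reaches the same constructor as two iota-steps.
agree = all(iota(g_bar, iota(f_bar, i)) == iota(fused, i) for i in range(1, n + 1))
print(fused, agree)
```

Both routes end in the constructor indexed by $g(f(i))$, which is why the $\theta$-step can be traded for a second $\iota$-step in the adjournment argument.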



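Lemma 13. $\beta\eta\iota\nu\theta$-conversion is confluent.

Proof. By Lemma 12 and Newman’s lemma it is enough to show that
$\beta\eta\iota\nu\theta$-conversion is locally confluent. Since $\theta$-conversion and $\beta\eta\iota\nu$-conversion are
convergent we have to check only the cases $\leftarrow_{\theta} ; \to_{\beta}$, $\leftarrow_{\theta} ; \to_{\eta}$, $\leftarrow_{\theta} ; \to_{\iota}$, $\leftarrow_{\theta} ; \to_{\nu}$. Let
us consider for example the case $\leftarrow_{\theta} ; \to_{\iota}$. In fact only the case of overlapping
redexes is of interest: $C[\overline{g \circ f}\, c_i] \leftarrow_{\theta} C[\bar g(\bar f\, c_i)] \to_{\iota} C[\bar g\, c'_{f(i)}]$.

The fork is closed by one application of $\iota$ at each side. The other three cases are
even simpler.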



Theorem 6. $\beta\eta\iota\nu\theta$-conversion is convergent.

With $\theta$-conversion (and the corresponding equivalence relation on terms) we have
the possibility to introduce a categorical structure on finite types, that is, to define
a category with the types $\omega_n$ as objects (due to renaming of constructors there may
be many for each n) and $\bar f$ for all $\omega_n, \omega_m$ and $f : [n] \to [m]$ as morphisms. For
this we don’t need the reduction $\overline{id_n} \to id_{\omega_n}$, where $id_n : [n] \to [n]$ is the identity
map and $id_{\omega_n} = \lambda x^{\omega_n}.x$. Indeed, the property $\bar f \circ \overline{id} =_{\theta} \bar f =_{\theta} \overline{id} \circ \bar f$ does hold,
and the stronger property $id \circ h = h$ for all $h : \omega_n \to \tau$ and $h \circ id = h$ for all
$h : \tau \to \omega_n$ is not needed.

Remark 3. This is the case where one may take $\overline{id_n}$ as the identity instead of the standard
term $\lambda x^{\omega_n}.x$. It is due to consideration of a part of the whole calculus as the
support of the corresponding category. Note that the term $id_{\omega_n} = \lambda x^{\omega_n}.x$ does not have
the form $\bar f$ and doesn’t belong to the categorical structure in question.

Groups of permutations. Similar considerations apply to the case when
we consider only the functions $f$ with $f : [n] \to [n]$ and the same type $\omega_n$ fixed
as the domain and codomain of $\bar f$. This case is of interest for representations of
the group of permutations $S_n$ within type theory.

With $\theta$-reduction we may introduce the group with terms $\bar f$ as elements. The
term $\overline{id_n}$ will play the role of unit and the inverse of $\bar f$ is represented by $\overline{f^{-1}}$.
Often in group theory, groups are defined using generators and relations and
elements are represented as products of generators. When elements of a group 
are represented by terms t : r — > r in type theory, it may be more interesting to 
consider conversions going the opposite way (w.r.t. the case considered above), 
i.e., “splitting” t into composition. 

We shall consider here one case related to the group of permutations. 

It is well known that every permutation $f : [n] \to [n]$ can be represented as
a product of disjoint cycles.

More precisely, $f$ is called a cycle if there exists some subset $\{i_1, \ldots, i_k\} \subseteq
\{1, \ldots, n\}$ such that $f(i_1) = i_2, \ldots, f(i_{k-1}) = i_k$, $f(i_k) = i_1$ and $f(i) = i$ if $i$ is not
in $\{i_1, \ldots, i_k\}$.

Two cycles are disjoint if the corresponding sets $\{i_1, \ldots, i_k\}$, $\{j_1, \ldots, j_l\}$ have
no common elements.

Product in $S_n$ is represented by functional composition of permutations.

If $f : [n] \to [n]$ is a permutation then $f = f_1 \circ \ldots \circ f_m$ where $f_1, \ldots, f_m$ are disjoint
cycles and the cycles that appear in the product are unique.

Product (composition) of disjoint cycles is commutative but it is possible
to order cycles (for example, lexicographically) and to have for every $f$ a unique
decomposition $f = f_1 \circ \ldots \circ f_m$ with $f_1 \le \ldots \le f_m$.
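
The decomposition itself is easy to compute. The following Python sketch is an illustration with assumed conventions: a permutation of $[n]$ is a dictionary from $\{1, \ldots, n\}$ to itself, fixed points are omitted, and cycles are ordered by their least element, which is one possible choice of the ordering mentioned above. The function names are invented for the example.

```python
def compose(g, f):
    """Functional composition g o f of two permutations given as dicts."""
    return {i: g[f[i]] for i in f}


def disjoint_cycles(perm):
    """Non-trivial disjoint cycles of `perm`, each starting at its least element."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, nxt = [start], perm[start]
        seen.add(start)
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        if len(cycle) > 1:  # fixed points contribute no cycle
            cycles.append(tuple(cycle))
    return cycles


def cycle_as_perm(cycle, n):
    """The permutation of [n] that rotates `cycle` and fixes everything else."""
    perm = {i: i for i in range(1, n + 1)}
    for pos, i in enumerate(cycle):
        perm[i] = cycle[(pos + 1) % len(cycle)]
    return perm


perm = {1: 3, 2: 2, 3: 5, 4: 6, 5: 1, 6: 4}
cycles = disjoint_cycles(perm)

# Composing the cycles back (in any order, since they are disjoint) rebuilds perm.
rebuilt = {i: i for i in perm}
for cycle in cycles:
    rebuilt = compose(rebuilt, cycle_as_perm(cycle, len(perm)))

print(cycles, rebuilt == perm)
```

Because the cycles are disjoint, composing them in the reverse order yields the same permutation, which is the commutativity used in the text.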

This suggests studying the conversions $\bar f\, t \to_{\theta'} \bar f_1(\ldots(\bar f_m\, t)\ldots)$ instead of $\to_{\theta}$,
where $f : [n] \to [n]$ and $f_1, \ldots, f_m$ are the disjoint cycles of the unique decomposition
of $f$.

Lemma 14. The $\beta\eta\iota\nu\theta'$-conversion is strongly normalising.

Proof. Obviously $\theta'$-conversion is SN. Assume that there exists an infinite
reduction sequence consisting of $\beta\eta\iota\nu\theta'$-reductions. It is enough to obtain a
contradiction with SN for $\beta\eta\iota\nu$-conversion.

Since $\beta\eta\iota\nu$-conversion and $\theta'$-conversion are SN, this sequence consists of
alternating finite intervals of $\beta\eta\iota\nu$- and $\theta'$-conversions. Let us consider the end
of the first interval consisting of $\theta'$-conversions. If the last $\theta'$-conversion of this interval
is followed by any of $\beta$-, $\eta$-, $\nu$-conversions then it can always be postponed.

The only case that poses problems is $C[\bar f\, c_i] \to_{\theta'} C[\bar f_1(\ldots(\bar f_m\, c_i)\ldots)]
\to_{\iota} C[\bar f_1(\ldots(\bar f_{m-1}\, c_{f_m(i)})\ldots)] \to_{\beta\eta\iota\nu\theta'} \ldots$

Lemma 15 (Auxiliary). Assume there is an infinite reduction sequence of
the form $C \to_{\beta\eta\iota\nu\theta'} \ldots$ and the term $C$ is of the form $C[\bar h\, c_i]$. Then there exists
an infinite reduction sequence beginning with the reduction $C[\bar h\, c_i] \to_{\iota} C[c_{h(i)}]$.

The algorithm modifying the initial sequence may be defined explicitly. It is first
defined for finite reduction sequences in such a way that it is hereditary w.r.t.
initial fragments, and this permits extending it to infinite sequences. Then it is
shown that infinite sequences remain infinite.

Thus, we insert the remaining $\iota$-conversions
$C[\bar f_1(\ldots(\bar f_{m-1}\, c_{f_m(i)})\ldots)] \to_{\iota} \ldots \to_{\iota} C[c_{f(i)}]$.

The reduction sequence remains infinite. Then $\theta'$-conversion can be postponed
w.r.t. the whole ‘block’ consisting of m $\iota$-conversions.

As a result, we will be able to show that there exists an infinite sequence of
$\beta\eta\iota\nu$-conversions. Contradiction.
Lemma 16. The $\beta\eta\iota\nu\theta'$-conversion is confluent.

Proof. Verification of local confluence is similar to Lemma 13.

Thus, we have

Theorem 7. The $\beta\eta\iota\nu\theta'$-conversion is convergent.

Faithful copy maps. We studied elsewhere [6] the isomorphism reductions,
i.e., the calculus with additional reductions of the form $f'(f(x)) \to x$ where
$f$ and $f'$ are mutually inverse w.r.t. standard conversions only extensionally
(on canonical elements of an inductive type). This is the case for $\bar f$ and $\overline{f^{-1}}$
where $f, f^{-1} : [n] \to [n]$ are mutually inverse permutations. If $\theta$-conversion is
already added it seems more natural, instead of adding isomorphism reductions
$\bar f(\overline{f^{-1}}\, t) \to t$, to add one reduction $\overline{id_n}\, t \to_{\omega} t$ for each $\omega_n$.

Note that $\overline{id_n} \to_{\eta} \lambda x : \omega_n.(\overline{id_n}\, x) \to_{\omega} \lambda x : \omega_n.x = id_{\omega_n}$. If we take mutually
inverse functions $f, f^{-1} : [n] \to [n]$ and $\bar f : \omega_n \to \omega'_n$, $\overline{f^{-1}} : \omega'_n \to \omega_n$ then
$\lambda x : \omega_n.(\overline{f^{-1}}(\bar f\, x)) \to_{\theta} \lambda x : \omega_n.(\overline{id_n}\, x) \to_{\omega} \lambda x : \omega_n.x = id_{\omega_n}$.

Obviously $\omega$-conversion is convergent.

Theorem 8. The $\beta\eta\iota\nu\theta\omega$-conversion is convergent.

Now the interesting part is to consider the isomorphisms issued from the
marriage of this extension with faithful copy and related reductions.

Theorem 9. The $\beta\eta\iota\nu\theta\omega\chi$-conversion is convergent.

Proof. It uses essentially the schema of the proof of Theorem 5. We have to
check in addition the cases of $\chi$-conversion combined with $\theta$- and $\omega$-conversion
(both for adjournment/strong normalisation and confluence). They do not pose
problems due to the very ‘basic’ structure of the terms $\bar f$.

Note that using $\chi$-conversion one may obtain $\overline{id_n} \circ \overline{id_n} \to_{\chi} id_{\omega_n}$, but it doesn’t
include $\omega$-conversion since we don’t have $\overline{id_n} \to \overline{id_n} \circ \overline{id_n}$, which is not SN.

5 Conclusion 

Our purpose was to explore the possibilities of extending the computational content
of various systems of $\lambda$-calculus via extensions of reduction systems that preserve
convergence. In this paper we concentrated on the study of finite inductive
types and reductions related to categorical and group-theoretical properties, but
the impact of introducing new reductions is better understood in the context
of previous work, where the isomorphism reductions, reductions related to the
representation of Cartesian closed structure and the notion of faithful copy were
studied.

It should be noted that most of the results were proved using abstract techniques,
most of them due to R. Di Cosmo, instead of more usual ones
such as reducibility candidates or critical pairs.




